Abstract

We discuss and compare two methods of investigation for the asymptotic regime of
stochastic differential games with a finite number of players as the number of players
tends to infinity. These two methods differ in the order in which optimization and
passage to the limit are performed. When optimizing first, the asymptotic problem is
usually referred to as a mean-field game. Otherwise, it reads as an optimization problem
over controlled dynamics of McKean-Vlasov type. Both problems lead to the analysis of
forward-backward stochastic differential equations, the coefficients of which depend on
the marginal distributions of the solutions. We explain the difference between the nature
and solutions of the two approaches by investigating the corresponding forward-backward
systems. General results are stated and specific examples are treated, especially when
cost functionals are of linear-quadratic type.

1 Introduction 

The problem studied in this paper concerns stochastic differential games with a large num- 
ber of players. We compare two methods of investigation which offer, in the asymptotic 
regime when the number of players tends to infinity, a structure which is simple enough to 
be amenable to actual solutions, both from the theoretical and numerical points of view. 

In order to derive tractable solutions, we assume that all the players are similar in their
behavior, and that each individual, on his own, can hardly influence the outcome of the game.
We further strengthen the symmetry of the problem by assuming that the interaction between
the players is of mean field type in the sense that whenever an individual player has to make
a decision, he or she sees only averages of functions of the private states of the other players.
These games are symmetric in the sense that all the players are statistically identical, but
they are not anonymous^1 (see for example [14]) or with weak interaction in the sense of [13].
In the large game limit, a given player should feel the presence of the other players through
the statistical distribution of the private states of the other players, and should determine his
optimal strategy by optimizing his appropriately modified objective criterion taking the limit
$N \to \infty$ into account. The search for an approximate Nash equilibrium of the game in this
asymptotic regime is dictated by the Mean Field Game (MFG for short) proposition of Lasry
and Lions. See for example [El [H [HI [HI [H [U] . 

Without the optimization part, and when all the players use the same distributed feedback
strategy as suggested by the symmetry of the set-up, the large population regime is
reminiscent of Mark Kac's propagation of chaos theory put on rigorous mathematical ground
by McKean and known under the name of McKean-Vlasov (MKV for short) theory. See
for example Sznitman's beautiful mathematical treatment [22]. Indeed, according to these
works, one expects that, in the limit $N \to \infty$, the private states of the individual players
evolve independently of each other, each of them satisfying a specific stochastic differential
equation with coefficients depending upon the statistical distribution of the private state in
question. Having each player optimize his own objective function under the constraints of this
new private state dynamics amounts to the stochastic control of McKean-Vlasov dynamics,
a mathematical problem not understood in general. See nevertheless [1] for an attempt in this
direction. Coming back to our original problem, we may wonder if the optimization over the
feedback strategies in this new control problem leads to one form of approximate equilibrium
for the original $N$-player game. In any case, if such a problem can be solved, the issue is to
understand how an optimal feedback strategy for such a problem compares with the result of
the MFG analysis.

The main thrust of this paper is to investigate the similarities and the differences between 
the MFG approach to stochastic games with mean field interactions, and the search for an 
optimal strategy for controlled McKean-Vlasov stochastic differential equations. In Section
2 we give a pedagogical introduction to these two very related (and sometimes confused)
problems with the specific goal to explain the differences between their nature and solutions; in
Section 6 we also provide the reader with a short guide for tackling these problems within the
framework of Forward-Backward Stochastic Differential Equations (FBSDEs) of the McKean- 
Vlasov type. 

The remaining part of the paper is devoted to the discussion of special classes of models 
for which we can address the existence issue and compare the solutions of the two problems 
explicitly. We give special attention to Linear Quadratic (LQ) stochastic games because of 
their tractability. Indeed, in this case, the mean field character of the interaction is of a 
very simple nature as it involves only the empirical mean and the empirical variance of the 
individual states, and they both enter the coefficients linearly. For these models, we implement 
the MFG approach and analyze the optimal control of MKV dynamics in Sections 3 and 4
respectively. While bearing some similarity to the contents of this section, the results of [2] 
on some linear quadratic MFGs are different in the sense that they concern some infinite 
horizon stationary cases without any attempt to compare the results of the MFG approach 

^1 In the anonymous framework, dynamics are given for the statistical distribution of the population so that
the private dynamics of the players are not explicit. 

^2 After completion of this work, we were made aware of the appearance on the web of a very recent technical
report by A. Bensoussan, K. C. J. Sung, S. C. P. Yam, and S. P. Yung entitled Linear Quadratic Mean Field 
Games. In this independent work, the authors present a study of linear quadratic mean field games in relation 
to the control of McKean-Vlasov dynamics very much in the spirit of what we do in Section [3] and Section 
to the control of the corresponding McKean-Vlasov dynamics. We provide more explicit
examples in Section 5, including a simple example of CO2 emissions regulation which serves
as motivation for models where the interaction is of mean field type and appears only in the 
terminal cost. 

While the MFG approach does not ask for the solution of stochastic equations of the 
McKean-Vlasov type at first, the required fixed point argument identifies one of these so- 
lutions of standard optimal control problems as the de facto solution of an FBSDE of the 
McKean-Vlasov type as the marginal distributions of the solution appear in the coefficients 
of the equation. Since the McKean-Vlasov nature of these FBSDEs is rather trivial in the 
case of LQ mean field games, we devote Section [6] to a discussion of the construction of solu- 
tions to these new classes of FBSDEs which, as far as we know, have not been studied in the 
literature. 



2 Stochastic Differential Game with Mean Field Interactions 

We consider a class of stochastic differential games where the interaction between the players 
is given in terms of functions of average characteristics of the private states and actions of 
the individual players, hence their name Mean Field Games. In our formulation, the state of 
the system (which is controlled by the actions of the individual players) is given at each time 
$t$ by a vector $X_t = (X_t^1, \dots, X_t^N)$ whose $N$ components $X_t^i$ can be interpreted as the private
states of the individual players. A typical example capturing the kind of symmetry which we
would like to include is given by models in which the dynamics of the private states are given
by coupled stochastic differential equations of the form

$$dX_t^i = \frac{1}{N} \sum_{j=1}^{N} \tilde b(t, X_t^i, X_t^j, \alpha_t^i)\,dt + \sigma\,dW_t^i \tag{1}$$

where $\tilde b$ is a function of time, the values of two private states, and the control of one player, and
$((W_t^i)_{t \ge 0})_{1 \le i \le N}$ are independent Wiener processes. For the sake of simplicity, we assume that
each process $(W_t^i)_{0 \le t \le T}$ is univariate. Otherwise, the notations become more involved while
the results remain essentially the same. The present discussion can accommodate models
where the volatility $\sigma$ is a function with the same structure as $b$. We refrain from considering
this level of generality to keep the notations to a reasonable level. We use the notation
$\alpha_t = (\alpha_t^1, \dots, \alpha_t^N)$ for the players' strategies. Notice that the dynamics (1) can be rewritten
in the form:

$$dX_t^i = b(t, X_t^i, \bar\mu_t^N, \alpha_t^i)\,dt + \sigma\,dW_t^i \tag{2}$$

if the function $b$ of time, a private state, a probability distribution on private states, and a
control, is defined by

$$b(t, x, \mu, \alpha) = \int_{\mathbb{R}} \tilde b(t, x, x', \alpha)\,d\mu(x') \tag{3}$$

and the measure $\bar\mu_t^N$ is defined as the empirical distribution of the private states, i.e.

$$\bar\mu_t^N = \frac{1}{N} \sum_{j=1}^{N} \delta_{X_t^j}. \tag{4}$$

Interactions given by functions of the form (3) will be called linear or of order 1. We could
imagine that the drift of (1) giving the interaction between the private states is of the form

$$\frac{1}{N^2} \sum_{j,k=1}^{N} \tilde b(t, X_t^i, X_t^j, X_t^k, \alpha_t^i)$$

which could be rewritten, still with $\mu = \bar\mu_t^N$, in the form

$$b(t, x, \mu, \alpha) = \int_{\mathbb{R}} \int_{\mathbb{R}} \tilde b(t, x, x', x'', \alpha)\,d\mu(x')\,d\mu(x''). \tag{5}$$

Interactions of this form will be called quadratic or of order 2. Clearly, one can extend this
definition to interactions of all orders, and more generally, we will say that the interaction
is fully nonlinear if it is given by a drift of the form $b(t, X_t^i, \bar\mu_t^N, \alpha_t^i)$ for a general function $b$
defined on $[0, T] \times \mathbb{R} \times \mathcal{P}_1(\mathbb{R}) \times A$ where $\mathcal{P}_1(\mathbb{R})$ denotes the set of probability measures on the
real line and $A$ the space in which, at each time $t \in [0, T]$, the controls can be chosen by the
individual players. In general, we say that the game involves interactions of the mean field
type if the coefficients of the stochastic differential equation giving the dynamics of a private
state depend upon the other private states exclusively through the empirical distribution of
these private states (in other words, if the interaction is fully nonlinear in the sense just
defined) and if the running and terminal cost functions have the same structure.

Remark 1. As in the case of several of the models discussed later on, the dependence upon
the empirical distribution $\bar\mu_t^N$ of the private states can degenerate to a dependence upon some
moments of this distribution. To be more specific, we can have

$$b(t, x, \mu, \alpha) = \tilde b(t, x, \langle \varphi, \mu \rangle, \alpha) \tag{6}$$

for some scalar function $\varphi$ of the private states, where we use the duality notation
$\langle \varphi, \mu \rangle = \int_{\mathbb{R}} \varphi(x')\,d\mu(x')$ for the integral of a function with respect to a measure. In such a case, we
shall say that the interaction is scalar.
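
As a simple illustration of these definitions, consider the following choice of pairwise drift. The constant $a$ and the specific form of $\tilde b$ are assumptions made here for the sake of the example; they are not taken from the models treated later.

$$\tilde b(t, x, x', \alpha) = a\,(x' - x) + \alpha \quad \Longrightarrow \quad b(t, x, \mu, \alpha) = \int_{\mathbb{R}} \big[ a\,(x' - x) + \alpha \big]\,d\mu(x') = a\,\big( \langle \varphi, \mu \rangle - x \big) + \alpha, \qquad \varphi(x') = x'.$$

In this sketch the interaction is of order 1 in the sense of (3) and scalar in the sense of (6), since the drift depends on $\mu$ only through its mean.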



To summarize the problem at hand, the game consists in minimizing simultaneously costs
of the form

$$J^i(\alpha) = \mathbb{E}\left[ \int_0^T f(t, X_t^i, \bar\mu_t^N, \alpha_t^i)\,dt + g(X_T^i, \bar\mu_T^N) \right], \qquad i = 1, \dots, N, \tag{7}$$

under constraints of the form

$$dX_t^i = b(t, X_t^i, \bar\mu_t^N, \alpha_t^i)\,dt + \sigma\,dW_t^i, \qquad 0 \le t \le T, \tag{8}$$

where the $W^i$ are independent standard Wiener processes. (In the whole paper, we use the
underlined notation $\underline{Z}$ to denote a stochastic process $(Z_t)_{t \in I}$ indexed by time $t$ in some interval $I$.)
Note that for symmetry reasons, we choose the running and terminal cost functions $f$ and $g$
to be the same for all the players. Our goal is to search for equilibriums of such a stochastic
differential game.
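
The finite-player dynamics (8) can be explored numerically. The following is a minimal sketch under stated assumptions, not a method taken from this paper: it uses the hypothetical drift $b(t, x, \mu, \alpha) = (\bar\mu - x) + \alpha$, where $\bar\mu$ is the mean of $\mu$, a common distributed feedback supplied by the caller, and a plain Euler-Maruyama discretization. The function name `simulate_players` and all parameter names are illustrative.

```python
import math
import random


def simulate_players(x0, horizon, n_steps, sigma, feedback, seed=0):
    """Euler-Maruyama sketch of dX^i = [(mean - X^i) + phi(t, X^i)] dt + sigma dW^i.

    The drift is an assumed example of a mean field interaction: each player
    is attracted to the empirical mean and applies the common feedback phi.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    states = list(x0)
    n_players = len(states)
    for step in range(n_steps):
        t = step * dt
        # Only the first moment of the empirical measure enters this drift.
        mean = sum(states) / n_players
        new_states = []
        for x in states:
            drift = (mean - x) + feedback(t, x)
            noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            new_states.append(x + drift * dt + noise)
        states = new_states
    return states


if __name__ == "__main__":
    # Hypothetical usage: 500 players, zero feedback, moderate noise.
    start = [float(i % 5) for i in range(500)]
    end = simulate_players(start, 1.0, 200, 0.3, lambda t, x: 0.0, seed=42)
    print(sum(end) / len(end))
```

With zero noise and zero feedback the drifts sum to zero, so the empirical mean is preserved along the scheme; this gives a quick sanity check of the implementation.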
2.1 Optimization Problems for Individual Players

Given that the problem is intractable in most cases (see nevertheless the discussion of the
linear quadratic case in Sections 3 and 4 and the description of general results toward a
systematic treatment in Section 6), we try to identify realistic models for which approximate
equilibriums and optimal strategies can be identified and computed.

For the sake of definiteness, we restrict ourselves to equilibriums given by Markovian
strategies in closed loop feedback form

$$\alpha_t = (\phi^1(t, X_t), \dots, \phi^N(t, X_t)), \qquad 0 \le t \le T,$$

for some deterministic functions $\phi^1, \dots, \phi^N$ of time and the state of the system. Further, we
assume that the strategies are of distributed type in the sense that the function giving the
strategy of player $i$ depends upon the state $X_t$ of the system only through the private state
$X_t^i$ of player $i$. In other words, we request that:

$$\alpha_t^i = \phi^i(t, X_t^i), \qquad i = 1, \dots, N.$$

Moreover, given the symmetry of the set-up, we restrict our search for equilibriums to
situations in which all the players use the same feedback strategy function, i.e.

$$\phi^1(t, \cdot) = \dots = \phi^N(t, \cdot) = \phi(t, \cdot), \qquad 0 \le t \le T,$$

for some common deterministic function $\phi$. The search for an equilibrium with a fixed finite
number of players involves optimization, and the exact form of the optimization depends
upon the notion of equilibrium we are interested in. In any case, we hope that, in the large
game regime (i.e. when we let the number of players go to $\infty$), some of the features of the
problem will be streamlined and the optimization, if feasible, will provide a way to construct
approximate equilibriums for the game with a finite (though large) number of players. We
show that, depending upon the nature of the equilibrium we are aiming for, the effective
problem appearing in the limit can be of one form or another.



2.2 Search for Nash Equilibriums: Optimizing First 

If we search for a Nash equilibrium when the number $N$ of players is fixed, each player $i$
assumes that the other players have already chosen their strategies, say
$\alpha_t^{1*} = \phi^{1*}(t, X_t^1), \dots, \alpha_t^{i-1*} = \phi^{i-1*}(t, X_t^{i-1}), \alpha_t^{i+1*} = \phi^{i+1*}(t, X_t^{i+1}), \dots, \alpha_t^{N*} = \phi^{N*}(t, X_t^N)$,
and under this assumption, solves the optimization problem:

$$\phi^{i*} = \arg\min_{\phi} \mathbb{E}\left[ \int_0^T f(t, X_t^i, \bar\mu_t^N, \phi(t, X_t^i))\,dt + g(X_T^i, \bar\mu_T^N) \right] \tag{9}$$

when the dynamics of his own private state is controlled by the feedback control $\alpha_t^i = \phi(t, X_t^i)$
while the dynamics of the private states of the other players are controlled by the feedbacks
$\alpha_t^j = \phi^{j*}(t, X_t^j)$ for $j = 1, \dots, i-1, i+1, \dots, N$. So when writing an equation for a critical
point (typically a first order condition), we apply an infinitesimal perturbation to $\phi^{i*}$ without
perturbing any of the other $\phi^{j*}$ for $j \ne i$, and look for conditions under which this deviation
for player $i$ does not make him better off when the other players do not deviate from the
strategies they use. Also, recall that we are willing to restrict ourselves to optimal strategies
of the form $\phi^{1*} = \phi^{2*} = \dots = \phi^{N*}$ because of our symmetry assumption. As a result of this
exchangeability property of the system, even though a perturbation of $\phi^{i*}$ could in principle
affect the dynamics of all the private states by changing the empirical distribution of these
private states, when the number of players $N$ is large, a form of the law of
large numbers should make this empirical measure $\bar\mu_t^N$ quite insensitive to small perturbations
of $\phi^{i*}$. So for all practical purposes, the optimization problem (9) can be solved as (or at least
its solution can be approximated by) a standard stochastic control problem once the family
$(\bar\mu_t^N)_{0 \le t \le T}$ of probability measures is fixed.

So in this approach, the search for an approximate Nash equilibrium is based on the
following strategy: first, one fixes a family $(\mu_t)_{0 \le t \le T}$ of probability measures; next, one solves
the standard stochastic control problem (parameterized by the choice of the family $(\mu_t)_{0 \le t \le T}$):

$$\phi^* = \arg\min_{\phi} \mathbb{E}\left[ \int_0^T f(t, X_t, \mu_t, \phi(t, X_t))\,dt + g(X_T, \mu_T) \right] \tag{10}$$

subject to the dynamic constraint

$$dX_t = b(t, X_t, \mu_t, \phi(t, X_t))\,dt + \sigma\,dW_t, \tag{11}$$

for some Wiener process $(W_t)_{0 \le t \le T}$. Once this standard stochastic control problem is solved,
the remaining issue is the choice of the flow $(\mu_t)_{0 \le t \le T}$ of probability measures. This is where
the limit $N \to \infty$ comes to the rescue. Indeed, the asymptotic independence and the form of
law of large numbers provided by the theory of propagation of chaos (see for example [22]) tell
us that, in the limit $N \to \infty$, the empirical measure $\bar\mu_t^N$ should coincide with the statistical
distribution of $X_t$. So, once an optimum $\phi^*$ is found for each choice of the family $(\mu_t)_{0 \le t \le T}$,
then the family of measures is determined (typically by a fixed point argument) so that, at
each time $t$, the statistical distribution of the solution $X_t$ of (11) is exactly $\mu_t$.
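
The fixed point step can be made concrete on a deliberately simple example. The sketch below is an illustration under assumptions that are not taken from the text: drift $b = \alpha$, running cost $f = \alpha^2/2$ and terminal cost $g(x, \mu) = \frac{1}{2}(q x + \bar q \bar\mu)^2$, so that the frozen flow enters only through a candidate terminal mean $m$. For frozen $m$, taking expectations in the Pontryagin system gives the terminal mean of the optimally controlled state as $\Phi(m) = (x_0 - T q \bar q\, m)/(1 + T q^2)$, and the matching condition is $m = \Phi(m)$. The function names are hypothetical, and plain Picard iteration is only expected to converge when $|T q \bar q| < 1 + T q^2$.

```python
def best_response_terminal_mean(m, x0, horizon, q, q_bar):
    """Terminal mean of the optimally controlled state when the mean is frozen at m.

    Assumed toy model: dX = alpha dt + sigma dW, running cost alpha^2 / 2,
    terminal cost (q X_T + q_bar m)^2 / 2.
    """
    return (x0 - horizon * q * q_bar * m) / (1.0 + horizon * q * q)


def solve_mfg_fixed_point(x0, horizon, q, q_bar, tol=1e-12, max_iter=10_000):
    """Picard iteration m -> Phi(m) enforcing the matching condition m = E[X_T]."""
    m = x0  # initial guess for the frozen terminal mean
    for _ in range(max_iter):
        m_next = best_response_terminal_mean(m, x0, horizon, q, q_bar)
        if abs(m_next - m) < tol:
            return m_next
        m = m_next
    raise RuntimeError("fixed-point iteration did not converge")


if __name__ == "__main__":
    print(solve_mfg_fixed_point(x0=1.0, horizon=1.0, q=1.0, q_bar=1.0))
```

In this toy model the fixed point is available in closed form, $m^* = x_0 / (1 + T q (q + \bar q))$, which can be used to check the iteration.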

The role of the limit $N \to \infty$ (large number of players) is to guarantee the stability of the
empirical measure $\bar\mu_t^N$ when a single player perturbs his strategy while the other players keep
theirs unchanged, and the fact that this stable distribution has to be the common statistical
distribution of all the private states $X_t^i$. Performing the optimization over $\phi$ when the family
$(\mu_t)_{0 \le t \le T}$ of probability measures is kept fixed is the proper way to implement the notion of
Nash equilibrium whereby each player is not better off if he deviates from his strategy while
all the other players keep theirs untouched, as implied by the lack of change in $(\mu_t)_{0 \le t \le T}$ and
hence $(\bar\mu_t^N)_{0 \le t \le T}$.

From a mathematical point of view, the limit $N \to \infty$ can be justified rigorously in the
following sense: under suitable conditions, if there exists a family of measures $(\mu_t)_{0 \le t \le T}$ together
with an optimally controlled process in (10)-(11) with $(\mu_t)_{0 \le t \le T}$ as marginal distributions
exactly, then the corresponding optimal feedback $\phi^*$ is proven to provide an $\varepsilon$-Nash equilibrium
to the $N$-player game (9) for $N$ large. (See [11].) The argument relies on standard theory of
propagation of chaos for McKean-Vlasov diffusion processes. (See [22].) As a by-product, the
empirical measure of the $N$-player system, when driven by $\phi^*$, does indeed converge toward
$(\mu_t)_{0 \le t \le T}$. In such cases, this makes rigorous the MFG approach for searching for an
approximate Nash equilibrium to the $N$-player game. Still, the converse limit property seems to be
much more involved and remains widely open: do the optimal states of the $N$-player game
(if they exist) converge to some optimal states of (10)-(11) as $N$ tends to $\infty$?
Assuming that the family $\mu = (\mu_t)_{0 \le t \le T}$ of probability distributions has been frozen, the
Hamiltonian of the stochastic control problem is

$$H^{\mu_t}(t, x, y, \alpha) = y\,b(t, x, \mu_t, \alpha) + f(t, x, \mu_t, \alpha). \tag{12}$$

Recall that we are limiting ourselves to the case of constant and identical volatilities $\sigma$ for the
sake of simplicity. In all the examples considered in this paper, there exists a regular function
$\hat\alpha$ satisfying:

$$\hat\alpha = \hat\alpha^{\mu_t} : [0, T] \times \mathbb{R} \times \mathbb{R} \ni (t, x, y) \mapsto \hat\alpha(t, x, y) \in \arg\min_{\alpha \in A} H^{\mu_t}(t, x, y, \alpha). \tag{13}$$

We denote by $\mathcal{H}^{\mu_t}(t, x, y)$ this infimum:

$$\mathcal{H}^{\mu_t}(t, x, y) = \inf_{\alpha \in A} H^{\mu_t}(t, x, y, \alpha). \tag{14}$$
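
As a worked illustration of (12)-(14), one may anticipate the linear-quadratic coefficients of Section 3. The computation below is a sketch which assumes in addition that $A = \mathbb{R}$ and that $n_t > 0$, so that the Hamiltonian is strictly convex in $\alpha$; here $\bar\mu_t$ denotes the mean of $\mu_t$.

$$H^{\mu_t}(t, x, y, \alpha) = y\,(a_t x + \bar a_t \bar\mu_t + b_t \alpha + \beta_t) + \tfrac{1}{2} n_t \alpha^2 + \tfrac{1}{2} (m_t x + \bar m_t \bar\mu_t)^2,$$

$$\partial_\alpha H^{\mu_t} = b_t\,y + n_t\,\alpha = 0 \quad \Longrightarrow \quad \hat\alpha^{\mu_t}(t, x, y) = -\frac{b_t}{n_t}\,y,$$

$$\mathcal{H}^{\mu_t}(t, x, y) = y\,(a_t x + \bar a_t \bar\mu_t + \beta_t) - \frac{b_t^2}{2 n_t}\,y^2 + \tfrac{1}{2} (m_t x + \bar m_t \bar\mu_t)^2.$$

In this case the minimizer does not depend on $x$ or on $\mu_t$, which is one reason why the linear-quadratic setting is tractable.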

In this paper, we solve stochastic control problems using the probabilistic approach based 
on the Pontryagin minimum principle and solving the adjoint forward-backward stochastic 
differential equations. For the sake of completeness, we review the Lasry-Lions' approach 
using the Hamilton-Jacobi-Bellman (HJB for short) equation leading to the solution of a 
forward-backward system of nonlinear Partial Differential Equations (PDEs for short). 



HJB / PDE approach 

Since $\mu$ is frozen, the stochastic control problem is Markovian and we can introduce the HJB
value function:

$$v(t, x) = \inf_{\alpha \in \mathcal{A}_t} \mathbb{E}\left[ \int_t^T f(s, X_s, \mu_s, \alpha_s)\,ds + g(X_T, \mu_T) \,\Big|\, X_t = x \right] \tag{15}$$

where $\mathcal{A}_t$ denotes the set of admissible controls over the interval $[t, T]$. We expect that the
HJB value function $v$ is the solution in some sense (most likely in the viscosity sense only) of
the Hamilton-Jacobi-Bellman (HJB) equation (see [10])

$$\partial_t v + \frac{\sigma^2}{2} \partial^2_{xx} v + \mathcal{H}^{\mu_t}(t, x, \partial_x v(t, x)) = 0, \qquad (t, x) \in [0, T] \times \mathbb{R}, \tag{16}$$

with terminal condition $v(T, x) = g(x, \mu_T)$, $x \in \mathbb{R}$.

In the end, we would like $\mu = (\mu_t)_{0 \le t \le T}$ to be the flow of marginal distributions of the
optimally controlled private state. Clearly, this sounds like a circular argument at this stage
since this stochastic differential equation actually depends upon $\mu$, and it seems that only
a fixed point argument can resolve such a quandary. In any case, the flow of statistical
distributions should satisfy Kolmogorov's equation. In other words, if we use the notation

$$\beta(t, x) = b(t, x, \mu_t, \phi(t, x))$$

where $\phi$ is the optimal feedback control (think of $\phi(t, x) = \hat\alpha^{\mu_t}(t, x, \partial_x v(t, x))$), then the flow
$(\nu_t)_{0 \le t \le T}$ of measures should be given by $\nu_t = \mathcal{L}(X_t)$ and satisfy Kolmogorov's equation

$$\partial_t \nu - \frac{\sigma^2}{2} \partial^2_{xx} \nu + \mathrm{div}(\beta(t, x)\,\nu) = 0, \qquad (t, x) \in [0, T] \times \mathbb{R}, \tag{17}$$
with initial condition $\nu_0 = \mu_0$. This PDE can be given a rigorous meaning in the sense of
distributions. When $\nu_t$ has a smooth density, integration by parts can be used to turn the
formal PDE (17) for $\nu$ into a classical PDE for the density of $\nu_t$.

Setting $\nu_t = \mathcal{L}(X_t) = \mu_t$ when $(X_t)_{0 \le t \le T}$ is the diffusion optimally controlled by the
feedback function $\phi$ gives a system of coupled nonlinear forward-backward PDEs (16)-(17),
called the MFG PDE system. See [El [Ml HZl Hg [H E] . 
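
To make the distributional meaning of Kolmogorov's equation explicit, one can test the flow of marginal laws against a smooth, compactly supported function $h$ (a symbol introduced here only for this illustration). Applying Itô's formula to $h(X_t)$ for $dX_t = \beta(t, X_t)\,dt + \sigma\,dW_t$ and taking expectations gives

$$\frac{d}{dt} \langle h, \nu_t \rangle = \Big\langle \frac{\sigma^2}{2}\,h'' + \beta(t, \cdot)\,h',\ \nu_t \Big\rangle,$$

and, when $\nu_t$ has a smooth density $p(t, \cdot)$, integration by parts yields the classical forward equation

$$\partial_t p(t, x) = \frac{\sigma^2}{2}\,\partial^2_{xx} p(t, x) - \partial_x \big( \beta(t, x)\,p(t, x) \big).$$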

Stochastic Maximum Principle / FBSDE Approach

Given the frozen flow of measures $\mu$, the stochastic optimization problem is a standard
stochastic control problem and as such, its solution can be approached via the stochastic Pontryagin
principle. For each open-loop adapted control $\alpha = (\alpha_t)_{0 \le t \le T}$, we denote by $\underline{X}^\alpha = (X_t^\alpha)_{0 \le t \le T}$
the associated state; any solution $(Y_t, Z_t)_{0 \le t \le T}$ of the BSDE

$$dY_t = -\partial_x H^{\mu_t}(t, X_t, Y_t, \alpha_t)\,dt + Z_t\,dW_t, \quad t \in [0, T]; \qquad Y_T = \partial_x g(X_T, \mu_T), \tag{18}$$

is called a set of adjoint processes and the BSDE (18) is called the adjoint equation. The
necessary condition of the stochastic Pontryagin principle says that, whenever $\underline{X}^\alpha$ is an optimal
state of the optimization problem, it must hold $H^{\mu_t}(t, X_t, Y_t, \alpha_t) = \mathcal{H}^{\mu_t}(t, X_t, Y_t)$, $t \in [0, T]$.
(See [25].) Conversely, when the Hamiltonian $H^{\mu_t}$ is convex with respect to the variables $(x, \alpha)$
and the terminal cost $g$ is convex with respect to the variable $x$, the forward component of
any solution to the FBSDE

$$\begin{cases} dX_t = b(t, X_t, \mu_t, \hat\alpha^{\mu_t}(t, X_t, Y_t))\,dt + \sigma\,dW_t, \\ dY_t = -\partial_x H^{\mu_t}(t, X_t, Y_t, \hat\alpha^{\mu_t}(t, X_t, Y_t))\,dt + Z_t\,dW_t, \end{cases}$$

with the right initial condition for $X_0$ and the terminal condition $Y_T = \partial_x g(X_T, \mu_T)$, is an
optimally controlled path for (10)-(11). In particular, if we want to include the matching condition
of the MFG approach as outlined earlier, namely if we want to enforce $\mu_t = \mathcal{L}(X_t)$, the above
stochastic forward-backward system turns into

$$\begin{cases} dX_t = b(t, X_t, \mathcal{L}(X_t), \hat\alpha^{\mathcal{L}(X_t)}(t, X_t, Y_t))\,dt + \sigma\,dW_t, \\ dY_t = -\partial_x H^{\mathcal{L}(X_t)}(t, X_t, Y_t, \hat\alpha^{\mathcal{L}(X_t)}(t, X_t, Y_t))\,dt + Z_t\,dW_t, \end{cases} \tag{19}$$

with the right initial condition for $X_0$ and with the terminal condition $Y_T = \partial_x g(X_T, \mathcal{L}(X_T))$.
This FBSDE is of the McKean-Vlasov type as formally introduced in Subsection 6.1. If
convexity (as described above) holds, it characterizes the optimal states of the MFG problem.
If convexity fails, it provides a necessary condition only.

2.3 Search for Cooperative Equilibriums: Taking the Limit $N \to \infty$ First

As before, we assume that all the players use the same feedback function $\phi$, but the notion of
optimality is now defined in terms of what it takes for a feedback function $\phi^*$ to be critical.
In the search for a first order condition, for each player $i$, when $\phi^*$ is perturbed into $\phi$, we
now assume that all the players $j \ne i$ also use the perturbed feedback function. In other
words, they use the controls $\alpha_t^j = \phi(t, X_t^j)$ for $j = 1, \dots, i-1, i+1, \dots, N$. In this case,
the perturbation of $\phi^*$ has a significant impact on the empirical distribution $\bar\mu_t^N$, and the
optimization cannot be performed after the latter is frozen, as in the case of the search for
a Nash equilibrium. In the present situation, the simplification afforded by taking the limit
$N \to \infty$ is required before we perform the optimization.

Assuming that the feedback function $\phi$ is fixed, at least temporarily, the theory of
propagation of chaos states that, if we consider the solution $X_t^N = (X_t^{N,1}, \dots, X_t^{N,N})$ of the system of
$N$ stochastic differential equations (8) with $\alpha_t^i = \phi(t, X_t^{N,i})$, then in the limit $N \to \infty$, for any
fixed integer $k$, the joint distribution of the $k$-dimensional process $\{(X_t^{N,1}, \dots, X_t^{N,k})\}_{0 \le t \le T}$
converges to a product distribution (in other words the $k$ processes $(X_t^{N,i})_{0 \le t \le T}$ for
$i = 1, \dots, k$ become independent in the limit) and the distribution of each single marginal
process converges toward the distribution of the unique solution $\underline{X} = (X_t)_{0 \le t \le T}$ of the
McKean-Vlasov evolution equation

$$dX_t = b(t, X_t, \mathcal{L}(X_t), \phi(t, X_t))\,dt + \sigma\,dW_t \tag{20}$$

where $(W_t)_{0 \le t \le T}$ is a standard Wiener process. So if the common feedback control function
$\phi$ is fixed, in the limit $N \to \infty$, the private states of the players become independent of each
other, and for each given $i$, the distribution of the private state process $(X_t^{N,i})_{0 \le t \le T}$ evolving
according to (8) converges toward the distribution of the solution of (20). So if we optimize
after taking the limit $N \to \infty$, i.e. assuming that the limit has already been taken, the
objective of each player becomes the minimization of the functional



$$J(\phi) = \mathbb{E}\left[ \int_0^T f(t, X_t, \mathcal{L}(X_t), \phi(t, X_t))\,dt + g(X_T, \mathcal{L}(X_T)) \right] \tag{21}$$

over a class of admissible feedback controls $\phi$. Minimization of (21) over $\phi$ under the
dynamical constraint (20) is a form of optimal stochastic control where the controls are in closed
loop feedback form. More generally, such a problem can be stated as in (10) for open loop
controls $\alpha = (\alpha_t)_{0 \le t \le T}$ adapted to any specific information structure:

$$\alpha^* = \arg\min_{\alpha} \mathbb{E}\left[ \int_0^T f(t, X_t, \mathcal{L}(X_t), \alpha_t)\,dt + g(X_T, \mathcal{L}(X_T)) \right] \quad \text{subject to} \quad dX_t = b(t, X_t, \mathcal{L}(X_t), \alpha_t)\,dt + \sigma\,dW_t, \quad t \in [0, T]. \tag{22}$$

Naturally, we call this problem the optimal control of the stochastic McKean-Vlasov dynamics.
Here as well, the limit $N \to \infty$ can be justified rigorously: under suitable conditions, optimal
feedback controls in (22) are proven to be $\varepsilon$-optimal controls for the $N$-player system when
$N$ is large, provided that the rule we just prescribed above is indeed in force, that is: any
perturbation of the feedback is felt by all the players in a similar way. (See [5].)

Standard techniques from stochastic control of Markovian systems cannot be used for
this type of stochastic differential equations and a solution to this problem is not known
in general, even though solutions based on appropriate analogs of the Pontryagin minimum
principle have been sought by several authors. We first review the results of [1], which are
the only published ones which we know of. They only concern scalar interactions for which
the dependence upon the probability measure of the drift and cost functions is of the form

$$b(t, x, \mu, \alpha) = b(t, x, \langle \psi, \mu \rangle, \alpha), \qquad f(t, x, \mu, \alpha) = f(t, x, \langle \gamma, \mu \rangle, \alpha), \qquad g(x, \mu) = g(x, \langle \zeta, \mu \rangle)$$

for some functions $\psi$, $\gamma$, and $\zeta$. Recall that we use the duality notation $\langle \varphi, \mu \rangle$ to denote
the integral of the function $\varphi$ with respect to the measure $\mu$. Notice that we use the same
notations $b$, $f$ and $g$ for functions where the variable $\mu$, which was a measure, is replaced by
a numeric variable. As we explained earlier, this setting could be sufficient for the analysis
of most of the linear quadratic models we consider in Sections 3 and 4. We review the more
general results [6] of the first two named authors in Section 6.

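The major difference with the classical case is in the form of the adjoint equation. Given
a control process $\alpha = (\alpha_t)_{0 \le t \le T}$ and a process $\underline{X} = (X_t)_{0 \le t \le T}$ satisfying

$$dX_t = b(t, X_t, \mathcal{L}(X_t), \alpha_t)\,dt + \sigma\,dW_t, \qquad 0 \le t \le T, \tag{23}$$

a pair of processes $\underline{Y} = (Y_t)_{0 \le t \le T}$ and $\underline{Z} = (Z_t)_{0 \le t \le T}$ is said to form a pair of adjoint processes
if they satisfy:

$$\begin{cases} dY_t = -\big[ \partial_x b(t, X_t, \mathbb{E}\{\psi(X_t)\}, \alpha_t)\,Y_t + \partial_x f(t, X_t, \mathbb{E}\{\gamma(X_t)\}, \alpha_t) \big]\,dt + Z_t\,dW_t \\ \qquad\quad - \big[ \mathbb{E}\{\partial_{x'} b(t, X_t, \mathbb{E}\{\psi(X_t)\}, \alpha_t)\,Y_t\}\,\partial_x \psi(X_t) + \mathbb{E}\{\partial_{x'} f(t, X_t, \mathbb{E}\{\gamma(X_t)\}, \alpha_t)\}\,\partial_x \gamma(X_t) \big]\,dt, \\ Y_T = \partial_x g(X_T, \mathbb{E}\{\zeta(X_T)\}) + \mathbb{E}\{\partial_{x'} g(X_T, \mathbb{E}\{\zeta(X_T)\})\}\,\partial_x \zeta(X_T). \end{cases} \tag{24}$$

This BSDE, which is of the McKean-Vlasov type as its coefficients depend upon the
distribution of the solution, is called the adjoint equation. It provides a necessary condition for
the optimal states of the MKV optimal control problem, that is, for any optimally controlled
path, it must hold:

$$H(t, X_t, Y_t, \alpha_t) = \inf_{\alpha \in A} H(t, X_t, Y_t, \alpha) \tag{25}$$

for all $t \in [0, T]$. The Hamiltonian $H$ appearing in (25) is defined, for any random variable $\xi$,
as:

$$H(t, \xi, y, \alpha) = y\,b(t, \xi, \mathbb{E}\{\psi(\xi)\}, \alpha) + f(t, \xi, \mathbb{E}\{\gamma(\xi)\}, \alpha). \tag{26}$$

This result is proven in [1] (with the assumption that $A$ is convex and that $H$ is convex in
the variable $\alpha$). The sufficiency condition is given by (see [1] as well):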

Theorem 2.1. Assume that

(A1) $g$ is convex in $(x, x')$;

(A2) the partial derivatives $\partial_{x'} f$ and $\partial_{x'} g$ are non-negative;

(A3) the Hamiltonian $H$ is convex in $(x, x'_1, x'_2, \alpha)$;

(A4) the function $\psi$ is affine and the functions $\gamma$ and $\zeta$ are convex;

where the Hamiltonian function $H$ is defined as:

$$H(t, x, x'_1, x'_2, y, \alpha) = y\,b(t, x, x'_1, \alpha) + f(t, x, x'_2, \alpha). \tag{27}$$

If $\underline{X} = (X_t)_{0 \le t \le T}$ satisfies (23) for some control process $\alpha = (\alpha_t)_{0 \le t \le T}$, if $\underline{Y} = (Y_t)_{0 \le t \le T}$
and $\underline{Z} = (Z_t)_{0 \le t \le T}$ form a pair of adjoint processes and if (25) holds for all $t \in [0, T]$ (almost
surely), then the control $\alpha = (\alpha_t)_{0 \le t \le T}$ is optimal.
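
To see how the adjoint equation (24) differs from its MFG counterpart (18), it may help to specialize both to the linear-quadratic coefficients of Section 3. The computation below is a sketch which assumes scalar interactions through the mean, that is $\psi(x) = \gamma(x) = \zeta(x) = x$. In the MFG case, with the flow $\mu$ frozen and $\bar\mu_t$ denoting the mean of $\mu_t$, the adjoint equation reads

$$dY_t = -\big[ a_t Y_t + m_t (m_t X_t + \bar m_t \bar\mu_t) \big]\,dt + Z_t\,dW_t, \qquad Y_T = q\,(q X_T + \bar q\,\bar\mu_T),$$

whereas for the control of MKV dynamics one obtains

$$dY_t = -\big[ a_t Y_t + m_t (m_t X_t + \bar m_t \mathbb{E}\{X_t\}) \big]\,dt + Z_t\,dW_t - \big[ \bar a_t\,\mathbb{E}\{Y_t\} + \bar m_t (m_t + \bar m_t)\,\mathbb{E}\{X_t\} \big]\,dt,$$

$$Y_T = q\,(q X_T + \bar q\,\mathbb{E}\{X_T\}) + \bar q\,(q + \bar q)\,\mathbb{E}\{X_T\}.$$

Even after the matching $\bar\mu_t = \mathbb{E}\{X_t\}$, the terms involving $\bar a_t$, $\bar m_t (m_t + \bar m_t)$ and $\bar q (q + \bar q)$ have no analog in the MFG system; they come from differentiating the coefficients with respect to the measure argument.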

2.4 Summary 

The dichotomy in purpose suggested by the two notions of equilibrium discussed above points
to two different paths to go from the North West corner to the South East corner of the
following diagram, and the thrust of the paper is to provide insight, and demonstrate by
examples, that which path one chooses has consequences on the properties of the equilibrium;
in other words, the diagram is not commutative.



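$$\begin{array}{ccc}
\text{SDE State Dynamics for } N \text{ players} & \xrightarrow{\ \text{Optimization}\ } & \text{Nash Equilibrium for } N \text{ players} \\
\Big\downarrow {\scriptstyle \lim_{N \to \infty}} & & \Big\downarrow {\scriptstyle \lim_{N \to \infty}} \\
\text{McKean-Vlasov Dynamics} & \xrightarrow{\ \text{Optimization}\ } & \begin{array}{c} \text{Mean Field Game?} \\ \text{Controlled McK-V Dynamics?} \end{array}
\end{array}$$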

It is important to emphasize one more time what we mean by the limit $N \to \infty$. We want
to identify properties of the limit which, when re-injected into the game with finitely many 
players, give an approximate solution to the problem we are unable to solve directly for the 
stochastic game with N players. 

First Discussion of the Differences 

In the present context where we assume that the volatility $\sigma$ is constant, the general form of
MKV dynamics is given by the solution of non-standard stochastic differential equations in
which the distribution of the solution appears in the coefficients:

$$dX_t = b(t, X_t, \mathcal{L}(X_t), \alpha_t)\,dt + \sigma\,dW_t, \qquad X_0 = x, \tag{28}$$

where $\sigma$ is a positive constant. For a given admissible control $\alpha = (\alpha_t)_{0 \le t \le T}$ we write $\underline{X}^\alpha$ for
the unique solution to (28), which exists under the usual growth and Lipschitz conditions on
the function $b$, see [22] for example. The problem is to optimally control this process so as to
minimize the expectation:

$$J_{MKV}(\alpha) = \mathbb{E}\left[ \int_0^T f(t, X_t^\alpha, \mathcal{L}(X_t^\alpha), \alpha_t)\,dt + g(X_T^\alpha, \mathcal{L}(X_T^\alpha)) \right]. \tag{29}$$

On the other hand, the dynamics arising in the MFG approach are given by the solution of
a standard stochastic differential equation:

$$dX_t = b(t, X_t, \mu_t, \alpha_t)\,dt + \sigma\,dW_t, \qquad X_0 = x, \tag{30}$$

where $\mu = (\mu_t)_{0 \le t \le T}$ is a deterministic function with values in the space of probability
measures, which can be understood as a (temporary) proxy or candidate for the anticipated
statistical distribution of the random variables $X_t$. The expectation that has to be minimized
is now:

$$J_{MFG}(\alpha) = \mathbb{E}\left[ \int_0^T f(t, X_t^\alpha, \mu_t, \alpha_t)\,dt + g(X_T^\alpha, \mu_T) \right], \tag{31}$$

and an equilibrium takes place if, for an optimal control $\alpha$, the anticipated distribution $\mu_t$
actually coincides with $\mathcal{L}(X_t^\alpha)$ for every $t \in [0, T]$. While very similar, the two problems differ
by a very important point: the timing of the optimization.

In the MKV control problem (28)-(29), at each time $t$, the statistical distribution $\mu_t$ is
matched to the distribution of the state, and once we have $\mu_t = \mathcal{L}(X_t)$, we then optimize
the objective function. On the other hand, for the MFG problem (30)-(31), the optimization
is performed for each fixed family $\mu = (\mu_t)_{0 \le t \le T}$ of probability distributions, and once the
optimization is performed, one matches the distributions $\mu_t$ to the laws of the optimally
controlled process.
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3 Mean Field Linear Quadratic (LQ) Games 



This section is devoted to a class of models for which we can push the analysis further, to the
point of deriving explicit formulas in some cases. We choose coefficients of the linear-quadratic
type, that is

$$b(t,x,\mu,\alpha) = a_t x + \bar a_t \bar\mu + b_t \alpha + \beta_t,$$

$$f(t,x,\mu,\alpha) = \tfrac12 n_t \alpha^2 + \tfrac12 (m_t x + \bar m_t \bar\mu)^2, \qquad g(x,\mu) = \tfrac12 (q x + \bar q \bar\mu)^2,$$

where $a_t$, $\bar a_t$, $b_t$, $\beta_t$, $m_t$, $\bar m_t$ and $n_t$ are deterministic continuous functions of $t \in [0,T]$, and $q$
and $\bar q$ are deterministic constants. Recall that we use the notation $\bar\mu$ for the mean of the measure $\mu$, i.e.
$\bar\mu = \int x\,d\mu(x)$.

3.1 Solving the N-player Game

Under specific assumptions on the coefficients, existence of open loop Nash equilibriums for
Linear Quadratic (LQ for short) stochastic differential games of the type considered in this
section can be proven using the stochastic Pontryagin principle approach. See for example
[12]. We re-derive these results in the case of mean field models considered in this paper.

As in our introductory discussion, we assume that individual private states are one-dimensional.
This assumption is not essential; its only goal is to keep the notation to a reasonable
level of complexity. See nevertheless Remark 2 below. Moreover, we assume that the
actions of the individual players are also one-dimensional (i.e. $A = \mathbb{R}$). Using lower cases for
all the examples tackled below, the dynamics of the private states rewrite:

$$dx_t^i = [a_t x_t^i + \bar a_t \bar x_t + b_t \alpha_t^i + \beta_t]\,dt + \sigma\,dW_t^i, \qquad i = 1,\dots,N, \quad t \in [0,T]; \qquad x_0^i = x^{(0)},$$

where $x^{(0)}$ is a given initial condition in $\mathbb{R}$ and $\bar x_t$ stands for the empirical mean:

$$\bar x_t = \frac{1}{N}\sum_{i=1}^N x_t^i.$$

Similarly, the individual costs become:

$$J^i(\alpha) = \mathbb{E}\Big\{\int_0^T \tfrac12\big[n_t(\alpha_t^i)^2 + (m_t x_t^i + \bar m_t \bar x_t)^2\big]dt + \tfrac12 (q x_T^i + \bar q \bar x_T)^2\Big\}, \qquad i = 1,\dots,N.$$

We will assume throughout the section that $\inf_{t\in[0,T]} b_t > 0$ and $\inf_{t\in[0,T]} n_t > 0$.

3.2 Implementing the MFG Approach 

As explained earlier, the first step of the MFG approach is to fix a flow $\mu = (\mu_t)_{0\le t\le T}$ of
probability measures in lieu of the empirical measures of the players' private states, and solve
the resulting control problem for one single player. Since the empirical measure of the players'
private states enters the state equations and the cost functions only through its mean, it is
easier to choose a real valued deterministic function $(\bar\mu_t)_{0\le t\le T}$ (which we should denote $\bar\mu$ in
order to be consistent with the convention used so far in our notation system) as a proxy for
the mean of the empirical distribution of the private states. Then the individual stochastic
control problem is to minimize

$$J(\alpha) = \mathbb{E}\Big[\int_0^T \tfrac12\big[n_t\alpha_t^2 + (m_t x_t + \bar m_t\bar\mu_t)^2\big]dt + \tfrac12 (q x_T + \bar q\bar\mu_T)^2\Big]$$

subject to the dynamical constraint

$$dx_t = [a_t x_t + \bar a_t\bar\mu_t + b_t\alpha_t + \beta_t]\,dt + \sigma\,dW_t, \qquad t \in [0,T],$$

with $x_0 = x^{(0)}$ as initial condition. Given that the deterministic function $t \mapsto \bar\mu_t$ is assumed
fixed, the Hamiltonian of the stochastic control problem is equal to

$$H^{\bar\mu_t}(t,x,y,\alpha) = y[a_t x + \bar a_t\bar\mu_t + b_t\alpha + \beta_t] + \tfrac12 (m_t x + \bar m_t\bar\mu_t)^2 + \tfrac12 n_t\alpha^2. \tag{32}$$

The optimal control $\hat\alpha$ minimizing the Hamiltonian is given by:

$$\hat\alpha(t,y) = -\frac{b_t}{n_t}\,y, \qquad t \in [0,T],\ y \in \mathbb{R}, \tag{33}$$

and the minimum Hamiltonian by

$$\mathcal{H}^{\bar\mu_t}(t,x,y) = \inf_{\alpha\in A} H^{\bar\mu_t}(t,x,y,\alpha) = (a_t x + \bar a_t\bar\mu_t + \beta_t)\,y + \tfrac12 (m_t x + \bar m_t\bar\mu_t)^2 - \frac{b_t^2}{2n_t}\,y^2.$$
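As a quick check of (33) and of the minimized Hamiltonian, the first-order condition can be spelled out. This is only a restatement of the computation above, relying on the strict convexity in $\alpha$ guaranteed by $n_t > 0$:

$$\partial_\alpha H^{\bar\mu_t}(t,x,y,\alpha) = b_t y + n_t\alpha = 0 \iff \alpha = -\frac{b_t}{n_t}\,y,$$

$$y\,b_t\,\hat\alpha(t,y) + \tfrac12 n_t\,\hat\alpha(t,y)^2 = -\frac{b_t^2}{n_t}\,y^2 + \frac{b_t^2}{2n_t}\,y^2 = -\frac{b_t^2}{2n_t}\,y^2 .$$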

Stochastic Pontryagin Principle Approach 

In the present situation, the adjoint equation reads:

$$dy_t = -[a_t y_t + m_t(m_t x_t + \bar m_t\bar\mu_t)]\,dt + z_t\,dW_t, \qquad y_T = q(q x_T + \bar q\bar\mu_T). \tag{34}$$

Notice that because of the special form of the coefficients, this BSDE does not involve the
control $\hat\alpha_t$ explicitly. By (33), the optimal control has the form $\hat\alpha_t = -n_t^{-1} b_t y_t$, so that the
forward dynamics of the state become:

$$dx_t = \Big[a_t x_t + \bar a_t\bar\mu_t - \frac{b_t^2}{n_t}\,y_t + \beta_t\Big]dt + \sigma\,dW_t, \qquad x_0 = x^{(0)}, \tag{35}$$

and the problem reduces to the proof of the existence of a solution to the standard FBSDE
(34)-(35) and the analysis of the properties of such a solution. In particular, we will need to
check that this solution is amenable to the construction of a fixed point for $\bar\mu$. Notice that,
once this FBSDE is solved, (33) will provide us with an open loop optimal control for the
problem.

As a first step, we remind the reader of elementary properties of linear FBSDEs. We thus
streamline the notation and rewrite the FBSDE as

$$\begin{cases} dx_t = [\mathfrak{a}_t x_t + \mathfrak{b}_t y_t + \mathfrak{c}_t]\,dt + \sigma\,dW_t, & x_0 = x^{(0)},\\ dy_t = [\mathfrak{m}_t x_t - \mathfrak{a}_t y_t + \mathfrak{d}_t]\,dt + z_t\,dW_t, & y_T = \mathfrak{q}\,x_T + \mathfrak{r}, \end{cases} \tag{36}$$

where in order to better emphasize the linear nature of the FBSDE we set:

$$\mathfrak{a}_t = a_t, \quad \mathfrak{b}_t = -b_t^2/n_t, \quad \mathfrak{c}_t = \beta_t + \bar a_t\bar\mu_t, \quad \mathfrak{m}_t = -m_t^2, \quad \mathfrak{d}_t = -m_t\bar m_t\bar\mu_t, \quad \mathfrak{q} = q^2, \quad \mathfrak{r} = q\bar q\bar\mu_T.$$

Linear FBSDEs of the form (36) have been studied in [23, 24], but for the sake of completeness
we construct a solution from scratch, recalling some basic facts of the theory. Because of the
linearity of the system (36), we expect the FBSDE value function to be affine in the space
variable, so we search for deterministic functions $\eta_t$ and $\chi_t$ such that

$$y_t = \eta_t x_t + \chi_t, \qquad t \in [0,T]. \tag{37}$$

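Computing $dy_t$ from this ansatz using the expression of $dx_t$ from (36) we get

$$dy_t = \big[(\dot\eta_t + \mathfrak{a}_t\eta_t + \mathfrak{b}_t\eta_t^2)x_t + \dot\chi_t + \mathfrak{b}_t\eta_t\chi_t + \mathfrak{c}_t\eta_t\big]dt + \sigma\eta_t\,dW_t, \qquad t \in [0,T],$$

and identifying term by term with the expression of $dy_t$ given in (36) we get:

$$\begin{cases} \dot\eta_t = -\mathfrak{b}_t\eta_t^2 - 2\mathfrak{a}_t\eta_t + \mathfrak{m}_t, & \eta_T = \mathfrak{q},\\ \dot\chi_t + (\mathfrak{a}_t + \mathfrak{b}_t\eta_t)\chi_t = \mathfrak{d}_t - \mathfrak{c}_t\eta_t, & \chi_T = \mathfrak{r},\\ z_t = \sigma\eta_t. \end{cases} \tag{38}$$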
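When no closed form is available, the first equation in (38) can be integrated numerically backward from its terminal condition. The following is a minimal sketch under stated assumptions, not a prescribed method: the function name `solve_riccati_backward`, the callables `a`, `b`, `m` (standing for the streamlined coefficients of (36)) and the choice of a classical fourth-order Runge-Kutta scheme are all illustrative.

```python
def solve_riccati_backward(a, b, m, q_terminal, T, steps=2000):
    """Integrate eta' = -b(t) eta^2 - 2 a(t) eta + m(t) backward from eta(T).

    a, b, m are callables t -> float (hypothetical stand-ins for the
    coefficients of the linear FBSDE). Returns the values of eta on the
    uniform grid 0, T/steps, ..., T.
    """
    def rhs(t, eta):
        return -b(t) * eta * eta - 2.0 * a(t) * eta + m(t)

    h = T / steps
    eta = q_terminal
    values = [eta]
    t = T
    for _ in range(steps):
        # Classical RK4 step taken with the negative step size -h.
        k1 = rhs(t, eta)
        k2 = rhs(t - 0.5 * h, eta - 0.5 * h * k1)
        k3 = rhs(t - 0.5 * h, eta - 0.5 * h * k2)
        k4 = rhs(t - h, eta - h * k3)
        eta -= (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        t -= h
        values.append(eta)
    values.reverse()  # values[k] approximates eta at time k * h
    return values
```

With $\mathfrak{a} = \mathfrak{m} = 0$ and $\mathfrak{b} = -1$, the scheme can be compared against the explicit solution $\mathfrak{q}/(1 + \mathfrak{q}(T - t))$ of $\dot\eta = \eta^2$. The sketch does not detect blow-up of the Riccati solution, so it should only be trusted when the equation is known to be well-posed on $[0,T]$.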

The first equation is a Riccati equation. According to the classical theory of Ordinary Differential
Equations (ODEs for short), its solution may be obtained by solving the second order
linear equation

$$-\mathfrak{b}_t\ddot\theta_t + [\dot{\mathfrak{b}}_t - 2\mathfrak{a}_t\mathfrak{b}_t]\dot\theta_t + \mathfrak{m}_t\mathfrak{b}_t^2\theta_t = 0,$$

with terminal conditions $\theta_T = 1$ and $\dot\theta_T = \mathfrak{b}_T\mathfrak{q}$, and setting $\eta_t = (\mathfrak{b}_t\theta_t)^{-1}\dot\theta_t$, provided there
is a solution $t \mapsto \theta_t$ which does not vanish. In the framework of control theory, the first
equation in (38) can also be reformulated as the Riccati equation deriving from a deterministic
control problem of linear-quadratic type. Namely, the adjoint system associated with the
minimization of the cost function (with the same notations as in (36)):

$$\tilde J(\alpha) = \tfrac12\,\mathfrak{q}\,\xi_T^2 + \int_0^T \tfrac12\big[\alpha_t^2 - \mathfrak{m}_t\xi_t^2\big]dt \tag{39}$$

over the control $(\alpha_t)_{0\le t\le T}$, subject to

$$d\xi_t = [\mathfrak{a}_t\xi_t - (-\mathfrak{b}_t)^{1/2}\alpha_t]\,dt, \qquad t \in [0,T]; \qquad \xi_0 = x^{(0)},$$

admits the first equation in (38) as Riccati factorization. Here, $\mathfrak{q}$ and $-\mathfrak{m}$ are non-negative
so that the minimization problem (39) has a unique optimal path and the Riccati equation
is solvable. (See Theorem 37 in [21].) Therefore, the system (38) is solvable.
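For the reader who wants to verify the linearization of the Riccati equation, here is a short computation. It is a sketch that drops the time index, writes $\eta = \dot\theta/(\mathfrak{b}\theta)$, and assumes in addition that $\mathfrak{b}$ is differentiable:

$$\dot\eta + \mathfrak{b}\eta^2 = \frac{\ddot\theta}{\mathfrak{b}\theta} - \frac{\dot{\mathfrak{b}}\,\dot\theta}{\mathfrak{b}^2\theta} - \frac{\dot\theta^2}{\mathfrak{b}\theta^2} + \frac{\dot\theta^2}{\mathfrak{b}\theta^2} = \frac{\ddot\theta}{\mathfrak{b}\theta} - \frac{\dot{\mathfrak{b}}\,\dot\theta}{\mathfrak{b}^2\theta},$$

so that multiplying the Riccati equation $\dot\eta + \mathfrak{b}\eta^2 + 2\mathfrak{a}\eta - \mathfrak{m} = 0$ by $-\mathfrak{b}^2\theta$ gives

$$-\mathfrak{b}\ddot\theta + [\dot{\mathfrak{b}} - 2\mathfrak{a}\mathfrak{b}]\dot\theta + \mathfrak{m}\mathfrak{b}^2\theta = 0,$$

and the terminal values $\theta_T = 1$, $\dot\theta_T = \mathfrak{b}_T\mathfrak{q}$ yield $\eta_T = \mathfrak{q}$.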

As a by-product, we deduce that the forward-backward system (36) is solvable. By the
stochastic minimum principle, the solution is even unique. Indeed, the forward-backward
system (36) describes the optimal states of the stochastic control problem driven by the cost
functional

$$\tilde J(\alpha) = \mathbb{E}\Big[\tfrac12\,\mathfrak{q}\,\xi_T^2 + \mathfrak{r}\,\xi_T + \int_0^T \tfrac12\big[\alpha_t^2 - \mathfrak{m}_t\xi_t^2 - 2\mathfrak{d}_t\xi_t\big]dt\Big] \tag{40}$$

depending upon the control $(\alpha_t)_{0\le t\le T}$, subject to

$$d\xi_t = [\mathfrak{a}_t\xi_t - (-\mathfrak{b}_t)^{1/2}\alpha_t + \mathfrak{c}_t]\,dt + \sigma\,dW_t, \qquad t \in [0,T]; \qquad \xi_0 = x^{(0)}. \tag{41}$$

Since the cost coefficients in $\tilde J$ are convex in $x$ and strictly convex in $\alpha$ and since the drift
in (41) is linear in $x$ and $\alpha$, the stochastic minimum principle says that the optimal control
must be unique. Uniqueness of the solution to (36) follows directly. (See [20].) Notice that
the convexity of the cost functions holds in the original problem (34)-(35).

Going back to the notations in (34)-(35), we deduce that the system (34)-(35) is always
uniquely solvable when $\bar\mu$ is given. The point is thus to construct a fixed point for $\bar\mu = (\bar\mu_t)_{0\le t\le T}$.
If a fixed point does exist, that is if we can solve the system (34)-(35) with the
constraint $\bar\mu_t = \mathbb{E}(x_t)$ for $t \in [0,T]$, then the pair $(\bar\mu_t = \mathbb{E}(x_t), \bar y_t = \mathbb{E}(y_t))_{0\le t\le T}$ solves the
deterministic forward-backward system:

$$\begin{cases} d\bar\mu_t = \Big[(a_t + \bar a_t)\bar\mu_t - \dfrac{b_t^2}{n_t}\,\bar y_t + \beta_t\Big]dt, & \bar\mu_0 = x^{(0)},\\[4pt] d\bar y_t = -[a_t\bar y_t + m_t(m_t + \bar m_t)\bar\mu_t]\,dt, & \bar y_T = q(q + \bar q)\bar\mu_T. \end{cases} \tag{42}$$

Conversely, if the system (42) is solvable, then $\bar\mu$ satisfies the fixed point condition for the
mean-field game of linear-quadratic type since the linear system

$$\begin{cases} dx_t = \Big[a_t x_t + \bar a_t\bar\mu_t - \dfrac{b_t^2}{n_t}\,y_t + \beta_t\Big]dt, & x_0 = x^{(0)},\\[4pt] dy_t = -[a_t y_t + m_t(m_t x_t + \bar m_t\bar\mu_t)]\,dt, & y_T = q(q x_T + \bar q\bar\mu_T), \end{cases} \tag{43}$$

has a unique solution. Indeed, (43) is of the same structure as (36), with $\sigma = 0$ but with the
right sign conditions for $(b_t^2/n_t)_{0\le t\le T}$, for $(m_t^2)_{0\le t\le T}$ and for $q^2$, which is enough to repeat
the analysis from (37) to (41).

Similarly, the key point for the solvability of (42) is to reformulate it as the adjoint system
of a deterministic control problem of linear-quadratic type. Specifically, the minimization
problem of the cost function

$$\bar J(\alpha) = \tfrac12\,e_T\,q(q + \bar q)\,\xi_T^2 + \int_0^T \tfrac12\big[e_t n_t\alpha_t^2 + e_t m_t(m_t + \bar m_t)\xi_t^2\big]dt,$$

subject to

$$d\xi_t = [(a_t + \bar a_t)\xi_t + b_t\alpha_t + \beta_t]\,dt \qquad\text{and}\qquad e_t = \exp\Big(-\int_0^t \bar a_s\,ds\Big),$$

admits as adjoint system:

$$\begin{cases} d\xi_t = \Big[(a_t + \bar a_t)\xi_t - \dfrac{b_t^2}{e_t n_t}\,\zeta_t + \beta_t\Big]dt,\\[4pt] d\zeta_t = -[e_t m_t(m_t + \bar m_t)\xi_t + (a_t + \bar a_t)\zeta_t]\,dt, \qquad \zeta_T = e_T\,q(q + \bar q)\,\xi_T. \end{cases} \tag{44}$$

Clearly, $(\xi,\zeta)$ is a solution of (44) if and only if $(\xi, e^{-1}\zeta)$ is a solution of (42). Unique
solvability of (44) is known to hold in short time. In the case when $q(q + \bar q) \ge 0$ and
$\inf_{t\in[0,T]}[m_t(m_t + \bar m_t)] \ge 0$, the optimization problem has a unique optimal path over any
arbitrarily prescribed duration $T$ and the associated adjoint system (44) is uniquely solvable
in finite time as well. In both cases, the fixed point for $\bar\mu$ does exist and is unique:

Theorem 3.1. On top of the hypotheses already stated in Subsection 3.1, assume that
$q(q + \bar q) \ge 0$ and $\inf_{t\in[0,T]}[m_t(m_t + \bar m_t)] \ge 0$; then, given an initial condition for $x$, there
exists a unique continuous deterministic function $[0,T] \ni t \mapsto \bar\mu_t$ such that system (34)-(35)
admits a unique solution $(x,y)$ satisfying $\bar\mu_t = \mathbb{E}(x_t)$ for any $t \in [0,T]$.

Remark 2. We claimed that limiting ourselves to the one-dimensional case $d' = 1$ was
not restrictive, but it is only fair to mention that, in the higher dimensional framework, the
solution of matrix valued Riccati equations is more involved and requires extra assumptions
and that the change of variable $(\xi,\zeta) \mapsto (\xi, e^{-1}\zeta)$ used in (44) to reformulate (42) cannot be
used in such a straightforward way.

We just explained that a fixed point for $\bar\mu$ could be constructed by investigating the
unique solvability of a linear forward-backward ordinary differential equation. Going back
to the general discussion (36)-(38), the solvability of a linear forward-backward equation of
deterministic or stochastic type (as in (36)) reduces to the solvability of a Riccati equation of
the same type as the first equation in (38). In practice, once $(\eta_t)_{0\le t\le T}$ in (38) is computed,
the point is to plug its value in the third equation to determine $(z_t)_{0\le t\le T}$, and in the second
equation, which can then be solved by:

$$\chi_t = \mathfrak{r}\,e^{\int_t^T[\mathfrak{a}_u + \mathfrak{b}_u\eta_u]du} - \int_t^T [\mathfrak{d}_s - \mathfrak{c}_s\eta_s]\,e^{\int_t^s[\mathfrak{a}_u + \mathfrak{b}_u\eta_u]du}\,ds. \tag{45}$$

(When the Riccati equation is well-posed, its solution does not blow up and all the terms
above are integrable.) Now that the deterministic functions $(\eta_t)_{0\le t\le T}$ and $(\chi_t)_{0\le t\le T}$ are
computed, we rewrite the forward stochastic differential equation for the dynamics of the
state using the ansatz (37):

$$dx_t = [(\mathfrak{a}_t + \mathfrak{b}_t\eta_t)x_t + \mathfrak{b}_t\chi_t + \mathfrak{c}_t]\,dt + \sigma\,dW_t, \qquad x_0 = x^{(0)}.$$

Such a stochastic differential equation is solved explicitly:

$$x_t = x^{(0)}e^{\int_0^t[\mathfrak{a}_u + \mathfrak{b}_u\eta_u]du} + \int_0^t (\mathfrak{b}_s\chi_s + \mathfrak{c}_s)\,e^{\int_s^t[\mathfrak{a}_u + \mathfrak{b}_u\eta_u]du}\,ds + \sigma\int_0^t e^{\int_s^t[\mathfrak{a}_u + \mathfrak{b}_u\eta_u]du}\,dW_s. \tag{46}$$

A Simple Example 

For the sake of illustration, we consider the frequently used example where the drift $b$ reduces
to the control, namely $b(t,x,\mu,\alpha) = \alpha$, so that $a_t = \bar a_t = \beta_t = 0$, $b_t = 1$ and the state equation
reads

$$dx_t = \alpha_t\,dt + \sigma\,dW_t, \qquad t \in [0,T]; \qquad x_0 = x^{(0)}.$$

We also assume that the running cost is simply the square of the control, i.e. $f(t,x,\mu,\alpha) = \alpha^2/2$,
so that $n_t = 1$ and $m_t = \bar m_t = 0$. Using the notations and the results above, we see that
the FBSDE of the MFG approach has the simple form

$$\begin{cases} dx_t = -y_t\,dt + \sigma\,dW_t,\\ dy_t = z_t\,dW_t, \end{cases} \qquad t \in [0,T]; \qquad x_0 = x^{(0)}, \quad y_T = \mathfrak{q}\,x_T + \mathfrak{r}, \tag{47}$$

which we solve by postulating $y_t = \eta_t x_t + \chi_t$, and solving for the two deterministic functions
$\eta$ and $\chi$. We find:

$$\eta_t = \frac{\mathfrak{q}}{1 + \mathfrak{q}(T-t)}, \qquad \chi_t = \frac{\mathfrak{r}}{1 + \mathfrak{q}(T-t)},$$

(keep in mind that $\mathfrak{q} \ge 0$ so that the functions above are well-defined) and plugging these
expressions into (46) we get

$$x_t = x^{(0)}\,\frac{1 + \mathfrak{q}(T-t)}{1 + \mathfrak{q}T} - \frac{\mathfrak{r}\,t}{1 + \mathfrak{q}T} + \sigma[1 + \mathfrak{q}(T-t)]\int_0^t \frac{dW_s}{1 + \mathfrak{q}(T-s)}. \tag{48}$$

Notice further that the optimal control $\alpha_t$ and the adjoint process $y_t$ satisfy

$$-\alpha_t = y_t = \frac{\mathfrak{q}}{1 + \mathfrak{q}(T-t)}\,x_t + \frac{\mathfrak{r}}{1 + \mathfrak{q}(T-t)},$$

and that the only quantity depending upon the fixed mean function $t \mapsto \bar\mu_t$ is the constant
$\mathfrak{r} = q\bar q\bar\mu_T$, which depends only upon the mean state at the end of the time interval. Recalling
that $\mathfrak{q} = q^2$, this makes the search for a fixed point very simple and one easily checks that if

$$\bar\mu_T = \frac{x^{(0)}}{1 + q(q + \bar q)T} \tag{49}$$

then the mean at time $T$ of the random variable $x_T$ given by (48) is $\bar\mu_T$.
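The check behind (49) is a one-line computation. Taking expectations in (48) at $t = T$ (the stochastic integral has zero mean) and using $\mathfrak{q} = q^2$, $\mathfrak{r} = q\bar q\bar\mu_T$:

$$\mathbb{E}\{x_T\} = \frac{x^{(0)} - \mathfrak{r}\,T}{1 + \mathfrak{q}T} = \frac{x^{(0)} - q\bar q\,\bar\mu_T\,T}{1 + q^2T},$$

so that imposing $\mathbb{E}\{x_T\} = \bar\mu_T$ gives

$$\bar\mu_T\,(1 + q^2T) + q\bar q\,T\,\bar\mu_T = x^{(0)}, \qquad\text{i.e.}\qquad \bar\mu_T\,\big(1 + q(q + \bar q)T\big) = x^{(0)}.$$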

Remark 3. From (49), we deduce that a fixed point does exist (and in such a case is unique)
if $1 + q(q + \bar q)T > 0$. In the case when $q(q + \bar q) \ge 0$, this is always true, as announced in
Theorem 3.1. If $q(q + \bar q) < 0$, the condition is satisfied only if $T$ is small enough.



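4 Control of Mean Field LQ McKean-Vlasov Dynamics

As before, we restrict ourselves to the one-dimensional case. The problem of the optimal
control of the Linear Quadratic McKean-Vlasov dynamics consists in the minimization of the
functional

$$J(\alpha) = \mathbb{E}\Big[\int_0^T \big[\tfrac12 (m_t x_t + \bar m_t\mathbb{E}\{x_t\})^2 + \tfrac12 n_t\alpha_t^2\big]dt + \tfrac12 (q x_T + \bar q\,\mathbb{E}\{x_T\})^2\Big] \tag{50}$$

over all the admissible control processes $\alpha = (\alpha_t)_{0\le t\le T}$ under the constraint

$$dx_t = [a_t x_t + \bar a_t\mathbb{E}\{x_t\} + b_t\alpha_t + \beta_t]\,dt + \sigma\,dW_t, \qquad t \in [0,T]; \qquad x_0 = x^{(0)}. \tag{51}$$

As in the case of the MFG approach, we assume that all the coefficients $a_t$, $\bar a_t$, $b_t$, $\beta_t$, $m_t$,
$\bar m_t$, $n_t$ are deterministic continuous functions of $t \in [0,T]$, and that $q$ and $\bar q$ are deterministic
constants. We also assume the function $n_t$ to be positive. So in the notation of the Pontryagin
principle introduced earlier (recall Theorem 2.1), we have

$$\psi(x) = \gamma(x) = \zeta(x) = x, \qquad b(t,x,x',\alpha) = a_t x + \bar a_t x' + b_t\alpha + \beta_t,$$

$$f(t,x,x',\alpha) = \tfrac12 (m_t x + \bar m_t x')^2 + \tfrac12 n_t\alpha^2, \qquad g(x,x') = \tfrac12 (q x + \bar q x')^2,$$

for $(t,x,x') \in [0,T] \times \mathbb{R} \times \mathbb{R}$, so that the adjoint equation becomes

$$\begin{aligned} dy_t &= -[a_t y_t + m_t(m_t x_t + \bar m_t\mathbb{E}\{x_t\})]\,dt - [\bar a_t\mathbb{E}\{y_t\} + \bar m_t(m_t + \bar m_t)\mathbb{E}\{x_t\}]\,dt + z_t\,dW_t,\\ y_T &= q(q x_T + \bar q\,\mathbb{E}\{x_T\}) + \bar q(q + \bar q)\mathbb{E}\{x_T\}. \end{aligned} \tag{52}$$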

Notice that because of the special form of the coefficients, this BSDE does not involve the
control $\alpha_t$ explicitly. The control appears only indirectly, through the state $x_t$ and its mean
$\mathbb{E}\{x_t\}$. Notice also that the sign condition on the derivatives $\partial_{x'}f$ and $\partial_{x'}g$ in (A2) in Theorem
2.1 is not satisfied since these two are linear functions. A careful look at the proof of Theorem
2.1 shows that the sign condition is actually useless when the functions $\gamma$ and $\zeta$ are linear,
as is the case here. In order to apply the Pontryagin minimum principle, we guess the
candidate for the optimal control by minimizing the Hamiltonian:

$$\hat\alpha(t,x,x',y) = \operatorname*{argmin}_{\alpha\in\mathbb{R}} H(t,x,x',y,\alpha) = \operatorname*{argmin}_{\alpha\in\mathbb{R}}\Big\{y[a_t x + \bar a_t x' + b_t\alpha + \beta_t] + \tfrac12 (m_t x + \bar m_t x')^2 + \tfrac12 n_t\alpha^2\Big\}.$$

Notice that, here, there is one and only one variable $x'$ in the Hamiltonian and not two
variables $x'_1$ and $x'_2$ as in (27). The reason is that the functions $\psi$, $\gamma$ and $\zeta$ coincide. The first
order condition gives:

$$y\,b_t + n_t\,\hat\alpha(t,x,x',y) = 0, \qquad (t,x,x',y) \in [0,T] \times \mathbb{R}^3,$$

so that we choose

$$\hat\alpha_t = -\frac{b_t}{n_t}\,y_t, \qquad t \in [0,T], \tag{53}$$

as candidate for the optimal control. With this choice, the dynamics of the optimal state
become:

$$dx_t = \Big[a_t x_t + \bar a_t\mathbb{E}\{x_t\} - \frac{b_t^2}{n_t}\,y_t + \beta_t\Big]dt + \sigma\,dW_t, \qquad t \in [0,T]; \qquad x_0 = x^{(0)}, \tag{54}$$


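and the problem reduces to the proof of the existence of a solution to the following FBSDE
of the McKean-Vlasov type (and to the analysis of the properties of such a solution):

$$\begin{cases} dx_t = \Big[a_t x_t + \bar a_t\mathbb{E}\{x_t\} - \dfrac{b_t^2}{n_t}\,y_t + \beta_t\Big]dt + \sigma\,dW_t, \qquad x_0 = x^{(0)},\\[4pt] dy_t = -[a_t y_t + m_t(m_t x_t + \bar m_t\mathbb{E}\{x_t\})]\,dt - [\bar a_t\mathbb{E}\{y_t\} + \bar m_t(m_t + \bar m_t)\mathbb{E}\{x_t\}]\,dt + z_t\,dW_t,\\[2pt] \qquad y_T = q(q x_T + \bar q\,\mathbb{E}\{x_T\}) + \bar q(q + \bar q)\mathbb{E}\{x_T\}. \end{cases} \tag{55}$$

Since these equations are linear, we can solve first for the functions $t \mapsto \mathbb{E}\{x_t\}$ and $t \mapsto \mathbb{E}\{y_t\}$.

Indeed, taking expectations on both sides of (55) and rewriting the resulting system, using
the notation $\bar x_t$ and $\bar y_t$ for the expectations $\mathbb{E}\{x_t\}$ and $\mathbb{E}\{y_t\}$ respectively, we obtain:

$$\begin{cases} d\bar x_t = \Big[(a_t + \bar a_t)\bar x_t - \dfrac{b_t^2}{n_t}\,\bar y_t + \beta_t\Big]dt, & \bar x_0 = x^{(0)},\\[4pt] d\bar y_t = -\big[(a_t + \bar a_t)\bar y_t + (m_t + \bar m_t)^2\bar x_t\big]dt, & \bar y_T = (q + \bar q)^2\bar x_T. \end{cases} \tag{56}$$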



This system is of the same type as (36), with $\sigma = 0$ therein, but with the right sign conditions
for $(\mathfrak{b}_t)_{0\le t\le T}$, $(\mathfrak{m}_t)_{0\le t\le T}$ and $\mathfrak{q}$. In particular, we know from the previous analysis (37)-(41)
that (56) is uniquely solvable. The solution has the form $\bar y_t = \bar\eta_t\bar x_t + \bar\chi_t$ for two deterministic
functions $\bar\eta_t$ and $\bar\chi_t$ satisfying

$$\begin{cases} \dot{\bar\eta}_t = \dfrac{b_t^2}{n_t}\,\bar\eta_t^2 - 2(a_t + \bar a_t)\bar\eta_t - (m_t + \bar m_t)^2, & \bar\eta_T = (q + \bar q)^2,\\[4pt] \dot{\bar\chi}_t - \Big[\dfrac{b_t^2}{n_t}\,\bar\eta_t - (a_t + \bar a_t)\Big]\bar\chi_t = -\beta_t\bar\eta_t, & \bar\chi_T = 0. \end{cases} \tag{57}$$

Once $\bar\eta_t$ is computed, we plug its value in the second equation in (57), which can then be
solved as in (45). The deterministic functions $\bar\eta_t$ and $\bar\chi_t$ being computed, we solve for $\bar x_t$ and
$\bar y_t$ by plugging our ansatz $\bar y_t = \bar\eta_t\bar x_t + \bar\chi_t$ into the first equation of (56), as done in (46).
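The system (57) can be read off from (38) by a direct substitution. As a sketch of that bookkeeping, the averaged system (56) corresponds to the following choice of streamlined coefficients in (36):

$$\mathfrak{a}_t = a_t + \bar a_t, \quad \mathfrak{b}_t = -\frac{b_t^2}{n_t}, \quad \mathfrak{c}_t = \beta_t, \quad \mathfrak{m}_t = -(m_t + \bar m_t)^2, \quad \mathfrak{d}_t = 0, \quad \mathfrak{q} = (q + \bar q)^2, \quad \mathfrak{r} = 0.$$

Inserting these into the first two equations of (38) gives

$$\dot{\bar\eta}_t = \frac{b_t^2}{n_t}\,\bar\eta_t^2 - 2(a_t + \bar a_t)\bar\eta_t - (m_t + \bar m_t)^2, \qquad \dot{\bar\chi}_t + \Big[(a_t + \bar a_t) - \frac{b_t^2}{n_t}\,\bar\eta_t\Big]\bar\chi_t = -\beta_t\bar\eta_t,$$

with $\bar\eta_T = (q + \bar q)^2$ and $\bar\chi_T = 0$, which is the system (57).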

We can now replace the expectations $\mathbb{E}\{x_t\}$ and $\mathbb{E}\{y_t\}$ appearing in the FBSDE (55) of
the McKean-Vlasov type by the deterministic functions $\bar x_t$ and $\bar y_t$ obtained by solving the
forward-backward system of ODEs (56), and solve the linear FBSDE

$$\begin{cases} dx_t = [\mathfrak{a}_t x_t + \mathfrak{b}_t y_t + \mathfrak{c}_t]\,dt + \sigma\,dW_t, & x_0 = x^{(0)},\\ dy_t = [\mathfrak{m}_t x_t - \mathfrak{a}_t y_t + \mathfrak{d}_t]\,dt + z_t\,dW_t, & y_T = \mathfrak{q}\,x_T + \mathfrak{r}, \end{cases} \tag{58}$$

where for the purpose of this part of the proof, and in order to lighten the notation, we use
the same kind of notation we used in the case of the MFG approach by setting:

$$\mathfrak{a}_t = a_t, \quad \mathfrak{b}_t = -b_t^2/n_t, \quad \mathfrak{c}_t = \beta_t + \bar a_t\bar x_t,$$

$$\mathfrak{m}_t = -m_t^2, \quad \mathfrak{d}_t = -\bar m_t(2m_t + \bar m_t)\bar x_t - \bar a_t\bar y_t, \quad \mathfrak{q} = q^2, \quad \mathfrak{r} = \bar q(2q + \bar q)\bar x_T.$$

Here as well, we are in the framework of equation (36), with the right-sign conditions, so that
equation (58) is uniquely solvable. The pair process $(\mathbb{E}\{x_t\}, \mathbb{E}\{y_t\})_{0\le t\le T}$ is then the solution
to an ordinary forward-backward system, which is also of the same type as (36): the solution
must be the pair $(\bar x_t, \bar y_t)_{0\le t\le T}$ so that (55) is uniquely solvable. Clearly, the FBSDE value
function is affine, that is

$$y_t = \eta_t x_t + \chi_t, \qquad 0 \le t \le T, \tag{59}$$

for some $(\eta_t)_{0\le t\le T}$ and $(\chi_t)_{0\le t\le T}$ given by a system of the same type as (38). We claim:

Proposition 4.1. Under the current assumptions, for a given initial condition for the forward
equation, the FBSDE (55) is uniquely solvable.

Once $(\eta_t)_{0\le t\le T}$ and $(\chi_t)_{0\le t\le T}$ in (59) are determined by solving the corresponding Riccati
equation, we rewrite the forward stochastic differential equation for the dynamics of the state:

$$dx_t = [(\mathfrak{a}_t + \mathfrak{b}_t\eta_t)x_t + \mathfrak{b}_t\chi_t + \mathfrak{c}_t]\,dt + \sigma\,dW_t, \qquad t \in [0,T]; \qquad x_0 = x^{(0)},$$

which is solved explicitly, as in (46).

Notice that (46) shows that the optimally controlled state is still Gaussian despite the
nonlinearity due to the McKean-Vlasov nature of the dynamics. While the expectation $\mathbb{E}\{x_t\} = \bar x_t$
was already computed, expression (46) can be used to compute the variance of $x_t$ as a function
of time. Because of the linearity of the ansatz and the fact that $\eta_t$ and $\chi_t$ are deterministic,
the adjoint process $(y_t)_{0\le t\le T}$ is also Gaussian.

Remark 4. Using again the form of the ansatz, we see that the optimal control $\alpha_t$, which was
originally identified as an open loop control in (53), is in fact in closed loop feedback form
since it can be rewritten as

$$\alpha_t = -\frac{b_t}{n_t}\,\eta_t x_t - \frac{b_t}{n_t}\,\chi_t \tag{60}$$

via the feedback function $\phi(t,\xi) = -b_t(\eta_t\xi + \chi_t)/n_t$, which incidentally shows that the optimal
control is also a Gaussian process.

Remark 5. The reader might notice that, within the linear-quadratic framework, the conditions
for the unique solvability of the adjoint equations are not the same in the MFG approach
and in the control of MKV dynamics. On the one hand, optimization over controlled MKV
dynamics reads as an optimization problem of purely (strictly) convex nature, for which existence
and uniqueness of an optimal state is expected. On the other hand, the optimal states in
the MFG approach appear as the fixed points of a matching problem. Without any additional
assumptions, there is no reason why this matching problem should derive from a convex potential
even if the original coefficients of the game are linear-quadratic. In Theorem 3.1, the
sign conditions $q(q + \bar q) \ge 0$ and $\inf_{t\in[0,T]}[m_t(m_t + \bar m_t)] \ge 0$ are precisely designed to make
the matching problem (43) be (strictly) convex.



Simple Example 

We consider the same example as in the MFG approach where $b(t,x,\mu,\alpha) = \alpha$, so that
$a_t = \bar a_t = \beta_t = 0$, $b_t = 1$, so that the state equation reads

$$dx_t = \alpha_t\,dt + \sigma\,dW_t, \qquad t \in [0,T]; \qquad x_0 = x^{(0)},$$

and $f(t,x,\mu,\alpha) = \alpha^2/2$ so that $n_t = 1$ and $m_t = \bar m_t = 0$. Using the notation and the results
above, we see that the system (55) has the same form as (47), but $(\mathfrak{q},\mathfrak{r})$ in (47) is now given
by $\mathfrak{q} = q^2$ and $\mathfrak{r} = \bar q(2q + \bar q)\mathbb{E}\{x_T\}$. Postulating the relationship $y_t = \eta_t x_t + \chi_t$ and solving
for the two deterministic functions $\eta$ and $\chi$, we find the same expression as in (48):

$$x_t = x^{(0)}\,\frac{1 + \mathfrak{q}(T-t)}{1 + \mathfrak{q}T} - \frac{\mathfrak{r}\,t}{1 + \mathfrak{q}T} + \sigma[1 + \mathfrak{q}(T-t)]\int_0^t \frac{dW_s}{1 + \mathfrak{q}(T-s)}.$$

This makes the computation of $\mathbb{E}\{x_T\}$ very simple. We find

$$\mathbb{E}\{x_T\} = \frac{x^{(0)}}{1 + (q + \bar q)^2T},$$

which always makes sense and which is different from (49).
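The two terminal means can be compared numerically in a way that mirrors the order of operations of each approach. The sketch below is illustrative only: the function names are hypothetical, the best response uses the mean of the explicit solution at time $T$, and the matching step is a plain Picard iteration, which is an assumption of this example and converges only when $|q\bar q T| < 1 + q^2T$.

```python
def best_response_terminal_mean(mu_T, x0, q, qbar, T):
    """Mean of x_T when the proxy mean mu_T is frozen (optimize first)."""
    frak_q = q * q              # coefficient of x_T in the terminal condition
    frak_r = q * qbar * mu_T    # constant term, depends on the frozen proxy
    return (x0 - frak_r * T) / (1.0 + frak_q * T)


def mfg_terminal_mean(x0, q, qbar, T, iterations=200):
    """MFG: optimize for a fixed proxy, then match the proxy to the mean."""
    mu_T = x0
    for _ in range(iterations):
        mu_T = best_response_terminal_mean(mu_T, x0, q, qbar, T)
    return mu_T


def mkv_terminal_mean(x0, q, qbar, T):
    """MKV: the law is matched before optimizing; closed-form terminal mean."""
    return x0 / (1.0 + (q + qbar) ** 2 * T)
```

For $x^{(0)} = 1$, $q = 1$, $\bar q = 1/2$, $T = 1$ the iteration settles at $1/(1 + q(q + \bar q)T) = 0.4$, while the MKV mean is $1/(1 + (q + \bar q)^2T) \approx 0.308$. The two values coincide when $\bar q = 0$, that is when the terminal cost does not depend on the mean.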



5 Further Examples 

In this section we provide simple examples which do not fit in the linear quadratic mold of 
the previous sections, but still allow for an explicit comparison of the solutions of the two 
problems. Surprisingly, we shall see that these solutions do coincide in some special cases. 



5.1 Some Simple Particular Cases 

As in previous sections, we discuss the case when $b(t,x,\mu,\alpha) = \alpha$ and $f(t,x,\mu,\alpha) = \alpha^2/2$.
When the terminal cost is of quadratic type, the previous analysis shows that, in both cases,
the optimal control is simply the negative of the adjoint process, the difference between the
two cases lying in the shape of the terminal condition of the associated FBSDEs. We now discuss
several related forms of the function $g$ for which we can still conclude.

The case $g(x,\mu) = r x\bar\mu$, $r \in \mathbb{R}^*$. Here we only compare the end-of-the-period means. We
know from Section 3 that the solution(s) to the MFG problem are given by the FBSDE (47)
with the terminal condition $\mathfrak{q} = 0$ and $\mathfrak{r} = r\bar\mu_T$, so that, from (48), the fixed point condition
reads

$$\bar\mu_T = x^{(0)} - r\bar\mu_T T, \qquad\text{i.e.}\qquad (1 + rT)\bar\mu_T = x^{(0)}, \tag{61}$$

which says that there exists a unique fixed point when $1 + rT \ne 0$. If $1 + rT = 0$, then there
is no fixed point if $x^{(0)} \ne 0$ and there are infinitely many if $x^{(0)} = 0$. Given $\bar\mu_T$ as in (61),
we know from the analysis in Section 3 that (47) has a unique solution $(x_t, y_t, z_t)_{0\le t\le T}$. By
(48), it must hold $\mathbb{E}\{x_T\} = \bar\mu_T$, so that $(x_t, y_t, z_t)_{0\le t\le T}$ is the unique equilibrium of the MFG
problem.

When dealing with the control of MKV dynamics with the same choice of coefficients,
the story is rather different. Indeed, the terminal condition $g$ is not convex in the variables
$(x,x')$, so that Theorem 2.1 does not apply anymore. From [1], we know that the FBSDE
(24) just provides a necessary condition for the optimal states of the optimization problem.
We thus investigate the solutions to (24) as possible candidates for minimizing the related
cost. As above, (24) reduces to (47), with the terminal condition $\mathfrak{q} = 0$ and $\mathfrak{r} = 2r\mathbb{E}\{x_T\}$. By
(48), the necessary condition for $\mathbb{E}\{x_T\}$ reads

$$\mathbb{E}\{x_T\} = x^{(0)} - 2r\mathbb{E}\{x_T\}T, \qquad\text{i.e.}\qquad (1 + 2rT)\mathbb{E}\{x_T\} = x^{(0)},$$

which says that there exists a unique fixed point when $1 + 2rT \ne 0$. When $1 + 2rT = 0$, there
is no fixed point unless $x^{(0)} = 0$. Moreover, the optimal states, when they do exist, must
differ from the optimal states of the MFG problem (unless $x^{(0)} = 0$).
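The factor 2 in the MKV terminal condition can be traced as follows. This is a sketch which assumes that the terminal condition of the McKean-Vlasov adjoint equation has the same structure as in the linear-quadratic case of Section 4, namely the derivative in $x$ plus the expectation of the derivative in the mean argument $x'$. With $g(x,x') = r x x'$:

$$y_T = \partial_x g(x_T, \mathbb{E}\{x_T\}) + \mathbb{E}\big\{\partial_{x'} g(x_T, \mathbb{E}\{x_T\})\big\} = r\,\mathbb{E}\{x_T\} + r\,\mathbb{E}\{x_T\} = 2r\,\mathbb{E}\{x_T\},$$

whereas in the MFG problem the measure argument is frozen during the optimization, only the first term is present, and $y_T = r\bar\mu_T$.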

The case $g(x,\mu) = r x\bar\mu^2$, $r \in \mathbb{R}^*$. We argue as above. In the MFG framework, the fixed
point condition also derives from the FBSDE (47), with the terminal condition $\mathfrak{q} = 0$ and
$\mathfrak{r} = r\bar\mu_T^2$, and from (48). It reads

$$\bar\mu_T = x^{(0)} - rT\bar\mu_T^2, \qquad\text{i.e.}\qquad rT\bar\mu_T^2 + \bar\mu_T - x^{(0)} = 0. \tag{62}$$

By the same argument as in the previous case, given a solution to (62), there exists a unique
solution to (47). In particular, the number of solutions to the MFG problem matches the
number of solutions to (62) exactly.

For the MKV control problem, the FBSDE (47), with the terminal condition $\mathfrak{q} = 0$ and
$\mathfrak{r} = 3r\mathbb{E}\{x_T\}^2$, just provides a necessary condition for the existence of optimal states. By
(48), $\mathbb{E}\{x_T\}$ must solve:

$$\mathbb{E}\{x_T\} = x^{(0)} - 3rT\,\mathbb{E}\{x_T\}^2, \qquad\text{i.e.}\qquad 3rT\,\mathbb{E}\{x_T\}^2 + \mathbb{E}\{x_T\} - x^{(0)} = 0. \tag{63}$$

Existence and uniqueness of a solution to the second-order equations (62) and (63) depend
upon the values of the parameters $r$, $T$, and $x^{(0)}$. Let us have a quick look at these issues
in more detail. Let $S_{MFG} = \{(T,x^{(0)}) \in \mathbb{R}_+ \times \mathbb{R};\ 1 + 4rTx^{(0)} \ge 0\}$ and $S_{MKV} = \{(T,x^{(0)}) \in \mathbb{R}_+ \times \mathbb{R};\ 1 + 12rTx^{(0)} \ge 0\}$.
These represent the sets where the second-order equations are
solvable. It is easy to see that $S_{MFG} \subset S_{MKV}$ when $rx^{(0)} > 0$, with the converse inclusion
when $rx^{(0)} < 0$. Moreover, there exists a continuum of triplets $(r,T,x^{(0)})$ for which there is
at most one solution to the MKV control problem while there are exactly two solutions to the
MFG problem. In particular, if $r = -1$, $T = 1$, $x^{(0)} = 1/12$, then we have $(1,1/6) \in S_{MKV}$
and $(1,(3 \pm \sqrt{6})/6) \in S_{MFG}$.
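The numerical example above is easy to reproduce. The following sketch solves the two quadratic equations for the terminal mean; the helper names and the tolerance `tol` used to decide when the discriminant vanishes are choices made for this illustration.

```python
import math


def real_roots(a, b, c, tol=1e-12):
    """Real roots of a*z^2 + b*z + c = 0 (a != 0), sorted increasingly."""
    disc = b * b - 4.0 * a * c
    if disc < -tol:
        return []                       # no real solution
    if abs(disc) <= tol:
        return [-b / (2.0 * a)]         # double root
    s = math.sqrt(disc)
    return sorted([(-b - s) / (2.0 * a), (-b + s) / (2.0 * a)])


def mfg_candidates(r, T, x0):
    """Terminal means solving r*T*mu^2 + mu - x0 = 0 (MFG matching)."""
    return real_roots(r * T, 1.0, -x0)


def mkv_candidates(r, T, x0):
    """Terminal means solving 3*r*T*e^2 + e - x0 = 0 (MKV necessary condition)."""
    return real_roots(3.0 * r * T, 1.0, -x0)
```

For $r = -1$, $T = 1$, $x^{(0)} = 1/12$, `mkv_candidates` returns the single value $1/6$ (the discriminant $1 + 12rTx^{(0)}$ vanishes), while `mfg_candidates` returns the two values $(3 \pm \sqrt{6})/6$.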
General Linear Terminal Cost. The two previous examples can be easily generalized to
the case when the terminal cost is of the form $g(x,\mu) = x\gamma(\bar\mu)$. In the MFG approach, the
fixed point equation for the mean has the form (compare with (61)):

$$\bar\mu_T = x^{(0)} - T\gamma(\bar\mu_T).$$

For the MKV control problem, it reads

$$\mathbb{E}\{x_T\} = x^{(0)} - T\big[\gamma'(\mathbb{E}\{x_T\})\,\mathbb{E}\{x_T\} + \gamma(\mathbb{E}\{x_T\})\big].$$

Quadratic Terminal Cost. When $g(x,\mu) = x^2\gamma(\bar\mu)$, we deduce, by choosing $\mathfrak{q} = 2\gamma(\bar\mu_T)$ and
$\mathfrak{r} = 0$ in (47), that, in the MFG approach, the fixed point equation for the mean has the form

$$\bar\mu_T\big(1 + 2T\gamma(\bar\mu_T)\big) = x^{(0)}.$$

For the MKV control problem, the analysis of the adjoint equation cannot be tackled by
investigating the mean of $x_T$ first since the terminal condition depends upon the expectation
of the square of $x_T$.

5.2 Additive Running Cost 

Here we tackle two examples when b{t, x, fJ,,a) = a and / / 0. In both cases, the terminal 
cost g is set equal to 0. 

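The case $f(t,x,\mu,\alpha) = \alpha^2/2 + x\bar\mu$. In the MFG approach, we can write the equivalent of
the averaged forward-backward differential equation (43). We get as a fixed point condition
for the mean $(\bar\mu_t)_{0\le t\le T}$:

$$\begin{cases} d\bar\mu_t = -\bar y_t\,dt,\\ d\bar y_t = -\bar\mu_t\,dt, \end{cases} \qquad t \in [0,T]; \qquad \bar\mu_0 = x^{(0)}, \quad \bar y_T = 0.$$

This reads as the second-order ODE $\ddot{\bar\mu}_t - \bar\mu_t = 0$, with $\bar\mu_0 = x^{(0)}$ and $\dot{\bar\mu}_T = 0$, the solution of
which is given by

$$\bar\mu_t = x^{(0)}\,\frac{e^{-t} + e^{t-2T}}{1 + e^{-2T}}.$$

As in the simple case investigated in the previous subsection, once $(\bar\mu_t)_{0\le t\le T}$ is computed,
the solution to the corresponding FBSDE (49) exists and is unique, since the FBSDE is then
of the same form as (36).

For the MKV problem, convexity does not hold, so that the stochastic minimum principle
provides a necessary condition only. Here it has the form:

$$\begin{cases} d\mathbb{E}\{x_t\} = -\mathbb{E}\{y_t\}\,dt,\\ d\mathbb{E}\{y_t\} = -2\mathbb{E}\{x_t\}\,dt, \end{cases} \qquad t \in [0,T]; \qquad \mathbb{E}\{x_0\} = x^{(0)}, \quad \mathbb{E}\{y_T\} = 0.$$

Again, the above system reduces to a second-order ODE, the solution of which is:

$$\mathbb{E}\{x_t\} = x^{(0)}\,\frac{e^{-\sqrt{2}\,t} + e^{\sqrt{2}(t-2T)}}{1 + e^{-2\sqrt{2}\,T}}.$$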
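Both mean paths of this example solve a boundary value problem of the form $\ddot u = k u$, $u(0) = x^{(0)}$, $\dot u(T) = 0$, with $k = 1$ in the MFG case and $k = 2$ in the MKV case, whose solution is $u(t) = x^{(0)}\cosh(\sqrt{k}(T-t))/\cosh(\sqrt{k}\,T)$. The short sketch below evaluates this expression so that the two paths can be compared; the function name `mean_path` and the parameter `k` are notation introduced for this illustration only.

```python
import math


def mean_path(t, x0, T, k):
    """Solution of u'' = k*u with u(0) = x0 and u'(T) = 0, for k > 0.

    k = 1.0 gives the MFG mean and k = 2.0 the MKV mean of the example
    with running cost alpha^2/2 + x*mean and zero terminal cost.
    """
    w = math.sqrt(k)
    return x0 * math.cosh(w * (T - t)) / math.cosh(w * T)
```

Since $\cosh(\sqrt{2}\,T) > \cosh(T)$ for $T > 0$, the terminal mean of the MKV candidate is smaller in absolute value than the MFG one whenever $x^{(0)} \ne 0$.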

The linear-quadratic case with zero terminal cost. We finally go back to the LQ
framework. As above, we choose $b(t,x,\mu,\alpha) = \alpha$ and $g = 0$. For the running cost, we
consider the classical case where $f(t,x,\mu,\alpha) = \alpha^2/2 + (x - \bar\mu)^2/2$. Here it is easy to see
that, in both cases, the solution to the adjoint backward equation has zero mean, so that the
expectation of any optimal states must be constant.

5.3 A Simple Model for Emissions Regulation

For the sake of motivation, we present a toy model of the simplest form of Green House 
Gas (GHG) emissions regulation. Our interest in this model is the fact that the mean field 
interaction between the players appears naturally in the terminal cost function. 

A set $\{1,\dots,N\}$ of firms compete in an economy where green house gas emissions are
regulated over a period $[0,T]$, for some $T > 0$. For each firm $i$, we denote by $X_t^i$ the perceived
value at time $t$ of what its own cumulative emissions will be at maturity $T$, and we assume
that its dynamics satisfy

$$dX_t^i = (b_t^i - \alpha_t^i)\,dt + \sigma_t^i\,dW_t^i, \qquad t \in [0,T]; \qquad X_0^i = x^{(0)}, \tag{64}$$

where assumptions on the individual emission rates $b_t^i$ and the volatilities $\sigma_t^i$ can be chosen
as in [7] for the sake of definiteness. The process $(\alpha_t^i)_{0\le t\le T}$ represents the abatement rate
of firm $i$, and it can be viewed as the control the firm can exert on its emissions output. In
the case $\alpha_t^i \equiv 0$, the process $(X_t^i)_{0\le t\le T}$ gives the perceived cumulative emissions of firm $i$
in the absence of regulation, a situation which is called Business As Usual (BAU for short). At
the start $t = 0$ of the regulation implementation period, each firm $i$ is allocated a number
$\Lambda_i$ of permits (also called certificates of emission, or allowances). The cap $\Lambda^{(N)} = \sum_{i=1}^N \Lambda_i$
can be viewed as the emissions target set by the regulator for the period $[0,T]$. If at the
end of the regulation period $[0,T]$, the aggregate emissions in the economy exceed the cap,
i.e. if $\sum_{i=1}^N X_T^i > \Lambda^{(N)}$, then each firm $i$ has to offset its emissions $X_T^i$ (expressed in CO2
ton equivalent) by redeeming one permit per ton, or by paying a penalty $\lambda$ for each ton not
covered by a certificate. In other words, firm $i$ has to pay

$$\lambda\,(X_T^i - \Lambda_i)^+\,\mathbf{1}_{(\Lambda^{(N)},\infty)}\Big(\sum_{j=1}^N X_T^j\Big) \tag{65}$$

where we use the notation $x^+ = \max(x,0)$ for the positive part of the real number $x$, and $\mathbf{1}_A$
for the indicator function of the set $A$. The penalty $\lambda$ is currently equal to 100 euros in the
European Union Emissions Trading Scheme (EU ETS).
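The penalty rule can be made concrete with a few lines of code. The sketch below is only an illustration of the payment described above: the function name and argument names are hypothetical, and emissions and allowances are passed as plain lists indexed by firm.

```python
def penalty_payment(i, emissions, allowances, penalty_per_ton):
    """Amount paid by firm i at the end of the regulation period.

    emissions[j] is the terminal cumulative emission of firm j and
    allowances[j] its initial allocation; the cap is the sum of allocations.
    """
    cap = sum(allowances)
    if sum(emissions) <= cap:
        # Aggregate emissions do not exceed the cap: nobody pays.
        return 0.0
    uncovered = max(emissions[i] - allowances[i], 0.0)
    return penalty_per_ton * uncovered
```

With two firms emitting 12 and 7 tons against allocations of 10 and 8, the cap of 18 is exceeded, so the first firm pays for its 2 uncovered tons and the second pays nothing. If aggregate emissions stay at or below the cap, no firm pays, even one that exceeds its own allocation.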

Remark 6. In the case of cap-and-trade regulations, a market on which emissions certificates
can be bought and sold is also created as part of the regulation. In order to avoid penalties,
firms whose productions are likely to lead to cumulative emissions in excess of their initial
allocations $\Lambda_i$ may engage in buying allowances from firms which expect to meet demand
with less emissions than their own initial allocations. For the sake of simplicity, the present
discussion concentrates on the cap part, and disregards the trade part of the regulation.

In search for an optimal behavior, each firm needs to solve the following optimization 
problem. If we assume that the abatement costs for firm i are given by a function : M — )• M 
which is and strictly convex, is normalized to satisfy c*(0) = mine* , and satisfies Inada- 
like conditions, (c*(x) = /Jlxl^"*"" for some /3 > and a > is an example of such a cost 
function), and if its abatement strategy is q*, its terminal wealth is given by 

W^ = w'- £ c^aDdt - \{Xir - A,,)+l(A(iv),^) 4) , 
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where w'^ stands for the initial wealth of the firm i. Recall that, in our simple model, firms 
do not trade allowances. Now if we view the perceived emissions XI as the private state of 
firm i, and if we assume that each firm tries to maximize its expected terminal wealth, or 
equivalently minimize the objective function 



J' {a) 



E 



N 



(66) 



where we set at = {a\, • • • , a^), then we have formulated our emission regulation model as 
a stochastic differential game. Notice that, in this particular example, the equations (j64p 
giving the dynamics of the private states XI are decoupled. However, equations (f66l) giving 
the expected costs to minimize are coupled in a very specific way, namely through the average 
of the values of all the private states. Recasting the model in the framework of Section [2l we 
see that the drift and running costs of the private states XI are of the form: 



X, /i, a) 



-a. 



and 



f{t,x,fi,a) = c(a). 



For the sake of simplicity we assume that the BAU drift $b^i_t$ is zero, even if, from a modeling
point of view, this implies that the processes $X^i$ can take negative values. We refer the reader
to [7] for a specific discussion of the positivity of the emissions in similar models. The terminal
cost is given by the function $g$ defined on $\mathbb{R} \times \mathcal{P}_1(\mathbb{R})$ by

$$g(x,\mu) = \lambda (x - \Lambda)^+ \mathbf{1}_{\{\bar\mu > \Lambda\}},$$

where $\bar\mu$ denotes the mean of $\mu$ and $\Lambda = \Lambda^{(N)}/N$ stands for the cap per firm, which is
assumed to be independent of $N$. The quantity $\Lambda$ is the relevant form of the cap as it makes
sense in the limit $N \to \infty$ of a large number of firms treated similarly by the regulator.
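
The mean-field structure of this terminal cost can be made concrete with a short sketch. The function names below are illustrative and not taken from the paper, and the empirical mean of the states stands in for the mean of the measure $\mu$; the code only evaluates the penalty $\lambda (x - \Lambda)^+$, switched on when the average emission exceeds the cap per firm.

```python
def terminal_cost(x, mean_emission, lam, cap):
    """Penalty lam * (x - cap)^+, charged only if the mean emission exceeds the cap."""
    if mean_emission <= cap:
        return 0.0  # cap not reached on average: nobody pays
    return lam * max(x - cap, 0.0)


def empirical_terminal_costs(states, lam, cap):
    """Terminal costs of N firms, the measure being replaced by the empirical one."""
    mean_emission = sum(states) / len(states)
    return [terminal_cost(x, mean_emission, lam, cap) for x in states]
```

Each firm's cost depends on the other firms only through the average state, which is exactly the coupling described above.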



The Mean Field Game Approximation 

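If we assume quadratic abatement costs $c(\alpha) = \alpha^2/2$, then the approximate Nash equilibriums
provided by the MFG approach are given by the solutions $(x_t, y_t, z_t)_{0 \le t \le T}$ of the FBSDE (17),
with $x^{(0)}$ as initial condition and terminal condition

$$y_T = \lambda \mathbf{1}_{\{x_T > \Lambda\}} \mathbf{1}_{\{\bar\mu_T > \Lambda\}},$$

satisfying the matching problem $\mathbb{E}\{x_T\} = \bar\mu_T$, that is

$$\mathbb{E}\{x_T\} = x^{(0)} - T\lambda\, \mathbb{P}\{x_T > \Lambda\}\, \mathbf{1}_{\{\mathbb{E}\{x_T\} > \Lambda\}}. \tag{67}$$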

We refer the reader to [7] for the precise derivation of the terminal condition despite the
singularity at $\Lambda$. Because of the nonlinearity of the terminal condition, the matching problem
is much more involved than in the linear-quadratic setting. Precisely, the combination of the
nonlinearity of the terminal condition and of the strong coupling between the forward and
backward SDEs makes the fixed point equation rather intricate. In the specific situation when
the running cost is quadratic, as it is here assumed to be, the forward and backward equations
can be decoupled by a Hopf-Cole transformation, which suggests working with an exponential
of the solution of the Hamilton-Jacobi-Bellman equation. This transformation also applies
to the FBSDE by differentiating the value function of the control problem. Remember that
the value function of the FBSDE coincides with the derivative of the value function of the
control problem. Therefore it is simpler to investigate the Hamilton-Jacobi-Bellman equation
directly. In the present set-up, the forward-backward system of nonlinear PDEs describing
the MFG problem reads (compare with (16)-(17)):

$$\begin{aligned}
&\text{(Kolmogorov)} && \partial_t m - \frac{\sigma^2}{2}\partial^2_{xx} m + \partial_x(-\partial_x v\, m) = 0, && m(0,\cdot) = \delta_{x^{(0)}},\\
&\text{(HJB)} && \partial_t v + \frac{\sigma^2}{2}\partial^2_{xx} v - \frac{1}{2}(\partial_x v)^2 = 0, && v(T,\cdot) = \lambda(\cdot - \Lambda)^+ \mathbf{1}_{\{\bar m_T > \Lambda\}},
\end{aligned}$$

where the function $m(t,\cdot)$ is understood as the density of the law of $X_t$ in (19) (or equivalently
as the density of the fixed point $(\mu_t)_{0 \le t \le T}$ of the matching problem), so that:

$$\bar m_t = \bar\mu_t = \int_{\mathbb{R}} x\, m(t,x)\,dx, \quad t \in [0,T],$$

and $v$ as the HJB value function of the stochastic control problem when the family $(\mu_t)_{0 \le t \le T}$
is frozen. These PDEs are simple enough to be solved by a Hopf-Cole transformation:

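$$v = -\sigma^2 \log u \iff u = \exp\Bigl(-\frac{v}{\sigma^2}\Bigr), \qquad m = u\psi \iff \psi = \frac{m}{u}.$$

Then the pair $(u, \psi)$ solves the simple forward-backward system:

$$\begin{aligned}
&\partial_t u + \frac{\sigma^2}{2}\partial^2_{xx} u = 0, && u(T,\cdot) = \exp\bigl[-\lambda\sigma^{-2}(\cdot - \Lambda)^+ \mathbf{1}_{\{\bar m_T > \Lambda\}}\bigr],\\
&\partial_t \psi - \frac{\sigma^2}{2}\partial^2_{xx} \psi = 0, && \psi(0,\cdot) = \frac{1}{u(0,x^{(0)})}\,\delta_{x^{(0)}},
\end{aligned}$$

consisting of two fundamental heat equations coupled only at the end points of the time
interval.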
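
For the reader who wishes to check the linearization, here is a short worked computation, included only as a verification of the substitution $v = -\sigma^2 \log u$ and $m = u\psi$. Differentiating $v = -\sigma^2 \log u$ gives

$$\partial_t v = -\sigma^2 \frac{\partial_t u}{u}, \qquad \partial_x v = -\sigma^2 \frac{\partial_x u}{u}, \qquad \partial^2_{xx} v = -\sigma^2 \Bigl(\frac{\partial^2_{xx} u}{u} - \frac{(\partial_x u)^2}{u^2}\Bigr),$$

so that the quadratic terms cancel:

$$\partial_t v + \frac{\sigma^2}{2}\partial^2_{xx} v - \frac{1}{2}(\partial_x v)^2 = -\frac{\sigma^2}{u}\Bigl(\partial_t u + \frac{\sigma^2}{2}\partial^2_{xx} u\Bigr).$$

Similarly, since $-\partial_x v\, m = \sigma^2 \psi\, \partial_x u$, inserting $m = u\psi$ in the Kolmogorov equation yields

$$\partial_t m - \frac{\sigma^2}{2}\partial^2_{xx} m + \partial_x(-\partial_x v\, m) = \psi\Bigl(\partial_t u + \frac{\sigma^2}{2}\partial^2_{xx} u\Bigr) + u\Bigl(\partial_t \psi - \frac{\sigma^2}{2}\partial^2_{xx} \psi\Bigr).$$

Hence $u$ solves a backward heat equation and, $u$ being positive, $\psi$ solves a forward heat equation.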

Now the specific and simple form of the mean-field interaction term $\bar m_T$ (depending only
upon the first moment of the global distribution) leads us to distinguish three cases:

The Business As Usual (BAU) Solution. If we assume that the cap is not reached, then
$u(T,y) = 1$ for all $y \in \mathbb{R}$, which implies that, for all $(t,y) \in [0,T] \times \mathbb{R}$, we have $u(t,y) = 1$
and thus $v(t,y) = 0$. In this framework, the optimal abatement strategy is to do nothing,
i.e. $\alpha_t = 0$. This corresponds to BAU. So, in this case, $m(t,y) = \varphi_{x^{(0)},\sigma^2 t}(y)$, the density of
the Gaussian distribution with mean $x^{(0)}$ and variance $\sigma^2 t$. The fixed point condition for the
Nash equilibrium is then satisfied if:

$$\int_{\mathbb{R}} y\, \varphi_{x^{(0)},\sigma^2 T}(y)\,dy \le \Lambda, \quad\text{i.e.}\quad x^{(0)} \le \Lambda.$$

The Abatement Solution. Suppose now that the cap is exceeded. In this case the terminal
condition has the form: $v(T,y) = \lambda(y - \Lambda)^+$. An easy computation shows that

$$v(t,x) = -\sigma^2 \log\Bigl(\int_{\mathbb{R}} \varphi_{x,\sigma^2(T-t)}(y) \exp\bigl[-\lambda\sigma^{-2}(y - \Lambda)^+\bigr]\,dy\Bigr), \quad (t,x) \in [0,T] \times \mathbb{R}.$$

The optimal feedback is given by $\partial_x v(t,x)$, which satisfies the viscous Burgers equation:

$$\partial_t(\partial_x v) + \frac{\sigma^2}{2}\partial^2_{xx}(\partial_x v) - \partial_x v\, \partial_x(\partial_x v) = 0, \quad \partial_x v(T,\cdot) = \lambda \mathbf{1}_{\{\cdot > \Lambda\}}.$$

By the maximum principle, it holds that $0 \le \partial_x v \le \lambda$. Therefore, when $x^{(0)} > \Lambda + \lambda T$, the optimal
path $(x_t)_{0 \le t \le T}$ satisfies at maturity:

$$x_T > \Lambda + \sigma W_T, \tag{68}$$

so that $\bar m_T > \Lambda$, which guarantees that the fixed point condition for the Nash equilibrium is
satisfied.
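
The bounds on the feedback can be checked numerically. The sketch below evaluates the Gaussian integral defining $v$ in closed form (by completing the square in the exponent), under the assumption that the cap is exceeded and that $t < T$. The function names are ours and not the paper's.

```python
import math


def norm_cdf(z):
    """Standard Gaussian cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def _integral_parts(t, x, lam, cap, sigma, T):
    """Split the Gaussian integral of exp(-lam/sigma^2 (y-cap)^+) at y = cap."""
    if not t < T:
        raise ValueError("closed form requires t < T")
    s = sigma * math.sqrt(T - t)      # standard deviation of the heat kernel
    k = lam / sigma ** 2
    a = x - cap
    below = norm_cdf(-a / s)          # mass below the cap, where the penalty vanishes
    above = math.exp(-k * a + 0.5 * (k * s) ** 2) * norm_cdf((a - k * s * s) / s)
    return below, above


def value(t, x, lam, cap, sigma, T):
    """v(t, x) = -sigma^2 log u(t, x) in the abatement regime."""
    below, above = _integral_parts(t, x, lam, cap, sigma, T)
    return -sigma ** 2 * math.log(below + above)


def feedback(t, x, lam, cap, sigma, T):
    """Optimal feedback d_x v(t, x); by construction it lies between 0 and lam."""
    below, above = _integral_parts(t, x, lam, cap, sigma, T)
    return lam * above / (below + above)
```

The feedback is a ratio of two positive terms multiplied by $\lambda$, which gives a direct reading of the bounds $0 \le \partial_x v \le \lambda$ obtained above from the maximum principle.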

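Critical Case. If $x^{(0)}$ belongs to the interval $(\Lambda, \Lambda + \lambda T/2)$, existence of the fixed point
may fail. Indeed, (67) says that a necessary condition for the existence of a fixed point is
$\mathbb{E}(x_T) > \Lambda$, so that $x^{(0)}$ must satisfy

$$x^{(0)} \ge \mathbb{E}(x_T) + \lambda T\, \mathbb{P}\{x_T > \Lambda\}.$$

Following the proof of (68), we see that $x_T \ge x^{(0)} - \lambda T + \sigma W_T \ge \Lambda - \lambda T + \sigma W_T$. Therefore,

$$x^{(0)} \ge \mathbb{E}(x_T) + \lambda T\, \mathbb{P}\{\sigma W_T > \lambda T\} > \Lambda + \lambda T\, \mathbb{P}\{\sigma W_T > \lambda T\}.$$

When $\sigma$ tends to $+\infty$, the inequality degenerates into

$$x^{(0)} \ge \Lambda + \frac{\lambda T}{2},$$

so that, when $x^{(0)} < \Lambda + \lambda T/2$, existence of the fixed point fails for $\sigma$ large enough.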

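When $x^{(0)} \ge \Lambda + \lambda T/2$, we must refine the previous argument in order to settle the
question of existence of a fixed point. The difficulty is to compute $\mathbb{P}\{x_T > \Lambda\}$. Using the
expression of $m$ in terms of $\psi$, we see that:

$$\mathbb{P}\{x_T > \Lambda\} = \frac{\int_{\Lambda}^{+\infty} u(T,y)\, \varphi_{x^{(0)},\sigma^2 T}(y)\,dy}{\int_{-\infty}^{+\infty} u(T,y)\, \varphi_{x^{(0)},\sigma^2 T}(y)\,dy}.$$

Moreover,

$$\begin{aligned}
\int_{\Lambda}^{+\infty} u(T,y)\, \varphi_{x^{(0)},\sigma^2 T}(y)\,dy
&= (2\pi)^{-1/2}\sigma^{-1}T^{-1/2} \int_0^{+\infty} \exp\Bigl[-\lambda\sigma^{-2} y - \frac{1}{2}\sigma^{-2}T^{-1}\bigl(y - (x^{(0)} - \Lambda)\bigr)^2\Bigr]\,dy\\
&= \exp\bigl[\lambda\sigma^{-2}\bigl(\lambda T - 2(x^{(0)} - \Lambda)\bigr)/2\bigr]\, \Phi\bigl(\sigma^{-1}T^{-1/2}(x^{(0)} - \Lambda - \lambda T)\bigr),
\end{aligned}$$

where $\Phi$ stands for the cumulative distribution function of the standard Gaussian distribution,
so since

$$\int_{-\infty}^{\Lambda} u(T,y)\, \varphi_{x^{(0)},\sigma^2 T}(y)\,dy = \mathbb{P}\{\sigma W_T \le \Lambda - x^{(0)}\} = \Phi\Bigl(\frac{\Lambda - x^{(0)}}{\sigma\sqrt{T}}\Bigr),$$

we deduce

$$\mathbb{P}\{x_T > \Lambda\} = \frac{\exp[-\lambda\sigma^{-2}\delta]\, \Phi[\sigma^{-1}T^{-1/2}(-\lambda T/2 + \delta)]}{\exp[-\lambda\sigma^{-2}\delta]\, \Phi[\sigma^{-1}T^{-1/2}(-\lambda T/2 + \delta)] + \Phi[\sigma^{-1}T^{-1/2}(-\lambda T/2 - \delta)]}, \tag{69}$$

where we have set $\delta = x^{(0)} - \Lambda - \lambda T/2$.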

When $\delta = 0$, $\mathbb{P}\{x_T > \Lambda\} = 1/2$ so that $\mathbb{E}\{x_T\} = \Lambda$. This suggests that $x^{(0)} = \Lambda + \lambda T/2$ is a
critical initial condition for the existence of a Nash equilibrium. This can be proven rigorously
when $\sigma$ is large. Keep in mind the fact that existence then fails for $x^{(0)} < \Lambda + \lambda T/2$. Conversely,
when $\delta > 0$, letting $\sigma$ tend to $+\infty$ in (69), we obtain

$$\mathbb{P}\{x_T > \Lambda\} \to \frac{1}{2},$$

so that, by (67), $\mathbb{E}\{x_T\} > \Lambda$ for $\sigma$ large, and the fixed point condition holds.
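
The role of the critical value can be explored numerically. The following sketch evaluates the closed-form expression of $\mathbb{P}\{x_T > \Lambda\}$ as a function of $\delta = x^{(0)} - \Lambda - \lambda T/2$, together with the resulting mean $x^{(0)} - \lambda T\, \mathbb{P}\{x_T > \Lambda\}$ in the regime where the cap is exceeded. The function names are hypothetical, and the code is only an illustration of the formulas above.

```python
import math


def norm_cdf(z):
    """Standard Gaussian cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_exceed(delta, lam, sigma, T):
    """P{x_T > cap} as a function of delta = x0 - cap - lam*T/2."""
    s = sigma * math.sqrt(T)
    num = math.exp(-lam * delta / sigma ** 2) * norm_cdf((delta - lam * T / 2) / s)
    return num / (num + norm_cdf((-delta - lam * T / 2) / s))


def terminal_mean(delta, lam, sigma, T, cap):
    """E{x_T} = x0 - lam*T*P{x_T > cap}, assuming the cap is exceeded on average."""
    x0 = cap + lam * T / 2 + delta
    return x0 - lam * T * prob_exceed(delta, lam, sigma, T)
```

Scanning `terminal_mean` over $\delta$ for a fixed $\sigma$ indicates for which initial conditions the candidate mean is indeed above the cap, which is the consistency requirement of the fixed point.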

An intuitive explanation why the fixed point condition can fail goes as follows. In a 
reasonable (from the point of view of the choices of A, A and T) cap-and-trade scheme, the 
cap is expected to be reached to incentivize the implementation of abatement strategies. 
However, it is often observed that the cap is not reached at the end of the period. From the 
MFG theory viewpoint, the argument is as follows. An individual firm with negligible impact 
on the overall emissions can emit whatever it wants without impacting significantly the global 
emissions. However, as soon as this becomes everybody's strategy, the cap is reached. 

Search for Cooperative Equilibriums 

From an economic point of view, one might think that the search for an approximate Nash
equilibrium is well-adapted to the case when the firms decide on their own strategies without
any significant effect on the strategies of the others. Even if this situation sounds reasonable,
the reader might wonder about the case when some of the firms obey a similar policy. Part of
the answer is then given by Subsection 2.3. Indeed, in the extremal situation when all the
firms follow the same general strategy, equilibriums are expected to be of the cooperative type,
as described in Subsection 2.3. As the terminal cost function $g$ is non-convex with respect
to the expectation $\bar\mu$, the stochastic Pontryagin principle then provides necessary conditions
only for the optimal strategies. Within this framework, it is then quite remarkable that the
forward-backward equation (24) matches (19) exactly, provided that the expectation of the
emissions at maturity is different from the cap $\Lambda$ (so that both equations admit the same
solutions). The reason why they match is as follows. If the expectation of the emissions at
time $T$ differs from the cap, any slight perturbation of the overall strategy keeps the final
state of the system unchanged: the cap remains either exceeded or respected. Therefore, the
search for a local cooperative equilibrium leads to the same limit optimization problem as the
search for a local Nash equilibrium, which means that the necessary conditions deriving from
the stochastic Pontryagin principle are the same. The case when the mean of the emissions
at maturity fits the cap exactly is much more involved and goes far beyond the scope of the
paper: as shown in [7], the collateral effect of the cap on the dynamics of the carbon market
is highly singular in the vicinity of the cap when the market has some degeneracy. This is the
case here, as the dynamics for the mean are completely degenerate. (The previous paragraph
shows that the dynamics for the mean are related to the inviscid Burgers equation.)

6 General Solvability Results 

We argued that optimal paths to both MFG and MKV control problems appear as realizations 
of the forward component of the solution of systems of forward-backward stochastic equations 
of MKV type, though the corresponding forward-backward systems are different. On the one
hand, the forward-backward system of the MFG approach appears as a standard forward- 
backward system derived from the classical form of the stochastic Pontryagin principle, a 
nonlinear term of MKV type appearing as a result of the fixed point argument of the third 
step of the MFG approach. On the other hand, as we explained in the special case of scalar 
interactions derived from [1], the forward-backward system associated with the MKV control 
problem involves additional terms coming from the need to optimize with respect to the 
interaction terms. 

In this section, we address the solvability of general stochastic forward-backward equations 
of MKV type. General solvability results are given and then specialized when the coefficients 
of the forward-backward systems derive either from a MFG or a MKV control problem. 
Results are stated in a pedagogical manner, almost as a reader's guide to MFG and MKV 
control problems. In particular, they cover, in a more abstract fashion, the results and 
computations performed in the linear-quadratic setting (see Sections [3l H]) and for the other 
examples discussed in Section O Due to the heavy technical nature of the results and their 
proofs, we only sketch proofs, all the arguments being detailed in the forthcoming works 
[HI m [5] by the two first named authors. While statements are given in dimension 1 only, 
higher-dimensional versions are available in O SI [5] . 

6.1 Solvability of General MKV Forward-Backward Stochastic Equations 

Forward-backward systems such as (19) and (24) are here understood as special cases of more
general fully coupled forward and backward stochastic differential equations involving the
marginal distributions of the forward and backward solutions. Changing the notation ever
so slightly in order to accommodate the FBSDEs appearing in both analyses, we consider
equations of the form:

$$\begin{aligned}
dX_t &= B(t, X_t, Y_t, \mathcal{L}(X_t, Y_t))\,dt + \sigma\,dW_t\\
dY_t &= -F(t, X_t, Y_t, \mathcal{L}(X_t, Y_t))\,dt + Z_t\,dW_t, \quad 0 \le t \le T,
\end{aligned} \tag{70}$$

with $Y_T = G(X_T, \mathcal{L}(X_T))$ as terminal condition. As already mentioned, $X$ and $Y$ are both
one dimensional.

Because of the coupled structure of the system, solvability is a hard question to tackle:
fully coupled forward-backward systems are instances of stochastic two-point boundary value
problems for which both existence and uniqueness are known to fail under standard
Cauchy-Lipschitz conditions. When the coefficients do not depend on the marginal distributions of
the solutions, the forward and backward equations may be decoupled by taking advantage of
the noise. Indeed, when the noise is non-degenerate, it has a decoupling effect by regularizing
the underlying FBSDE value function. We refer to [8] for a review of this strategy. Recall
that by FBSDE value function, we mean the function $u$ giving $Y_t$ as a function of $X_t$, say
$u(t, X_t)$. When the coefficients $B(t,x,y)$, $F(t,x,y)$ and $G(x)$ are independent of the measure
argument, this value function satisfies the quasilinear PDE

$$\partial_t u(t,x) + \frac{1}{2}\sigma^2 \partial^2_{xx} u(t,x) + B(t,x,u(t,x))\,\partial_x u(t,x) + F(t,x,u(t,x)) = 0, \tag{71}$$

for $t \in [0,T]$ and $x \in \mathbb{R}$, with $u(T,x) = G(x)$.

When the solution interacts with itself, e.g. with its own distribution, the standard Markov
structure breaks down. Indeed, the marginal distribution of the process $X$ at time $t$ is needed
to compute the transitions towards the future values of the state of the system. Basically,
the Markov property must be considered in a larger space, namely the Cartesian product of 
the state space of the forward process, which is M in the present situation, with the space of 
probability measures on the state space, which is infinite dimensional if the state space is not 
finite. Put differently, the relationship between $Y$ and $X$ is expected to be of the form:

$$Y_t = v(t, X_t, \mathcal{L}(X_t)), \quad t \in [0,T], \tag{72}$$

for some mapping $v : [0,T] \times \mathbb{R} \times \mathcal{P}_1(\mathbb{R}) \to \mathbb{R}$. The PDE approach is then not sufficient
anymore: the underlying heat kernel derived from the noise still has a smoothing property
in the finite-dimensional component, but does not have any regularizing effect in the
infinite-dimensional direction of $v$. The regularity of $v$ in $(t,x)$ can be tackled by a very simple
observation: once the law of $(X, Y)$ in (70) has been computed, (70) reads as a Markovian
FBSDE with $u(t,x) = v(t,x,\mathcal{L}(X_t))$ as value function.

In the forthcoming paper [4], the solvability of the equation is tackled by a compactness
argument and the Schauder fixed point theorem. Because of the expected relationship (72)
between $Y$ and $X$, the conditional law of $Y_t$ given $X_t$ is expected to be a Dirac measure
for any $t \in [0,T]$. In other words, the law of the pair $(X_t, Y_t)$ is expected to have the form
$\varphi(t,\cdot) \diamond \mu_t$, where $\varphi(t,\cdot)$ is a continuous mapping from $\mathbb{R}$ into $\mathbb{R}$, $\mu_t$ is a probability measure
on $\mathbb{R}$ and $\varphi(t,\cdot) \diamond \mu_t$ is the measure on $\mathbb{R}^2$ obtained as the image (push-forward) of the
distribution $\mu_t$ on $\mathbb{R}$ by the mapping $\mathbb{R} \ni x \mapsto (x, \varphi(t,x))$. Given an element $\mu$ of the space
$E = \mathcal{P}_1(C([0,T]))$ of probability measures on the space $C([0,T])$ of real valued continuous
functions on $[0,T]$ and an element $\varphi$ of the space $F = C_b([0,T] \times \mathbb{R})$ of real valued bounded
continuous functions on $[0,T] \times \mathbb{R}$, we consider the system

$$\begin{aligned}
dX_t &= B(t, X_t, Y_t, \varphi(t,\cdot) \diamond \mu_t)\,dt + \sigma\,dW_t\\
dY_t &= -F(t, X_t, Y_t, \varphi(t,\cdot) \diamond \mu_t)\,dt + Z_t\,dW_t, \quad 0 \le t \le T,
\end{aligned} \tag{73}$$

with $Y_T = G(X_T, \mu_T)$ as terminal condition. Here $\mu = (\mu_t)_{0 \le t \le T}$ is the flow of marginal
distributions, where $\mu_t$ denotes the image (push-forward) of $\mu$ under the $t$-th coordinate map
$C([0,T]) \ni w \mapsto w(t) \in \mathbb{R}$. If this system admits a unique solution, we can consider as output
the measure $\mathcal{L}(X)$, which is an element of $E$, together with the FBSDE value function linking
$Y$ with $X$, which is an element of $F$ under suitable conditions. The following result is proven
in [4]:

Theorem 6.1. Assume $\sigma > 0$ and that $B$, $F$ and $G$ are bounded Lipschitz continuous with
respect to the space variable for the Euclidean distance, and with respect to the
McKean-Vlasov component for the Wasserstein metric, uniformly in the other variables. Then, the
mapping $\Phi : E \times F \ni (\mu, \varphi) \mapsto (\mathcal{L}(X), u) \in E \times F$, where $u$ is the FBSDE value function
such that $Y_t = u(t, X_t)$ for any $t \in [0,T]$, has a fixed point.

Recall that the Wasserstein metric between two probability measures $\eta$ and $\eta'$ on $\mathbb{R}^d$,
$d \ge 1$, is the square root of the infimum of $\int_{\mathbb{R}^{2d}} |z - z'|^2\,d\pi(z,z')$ over all the probability
measures $\pi$ on $\mathbb{R}^{2d}$ admitting $\eta$ and $\eta'$ as marginals.

Notice that given any $(\mu, \varphi)$ in $E \times F$, the forward-backward system is standard, so that
unique solvability holds, see e.g. Delarue [8]. The proof consists in showing that $\Phi$ leaves a
bounded closed convex subset $\Gamma \subset E \times F$ stable and that the restriction of $\Phi$ to $\Gamma$ is continuous
and has a relatively compact range, $E \times F$ being endowed with the product of the topology
of weak convergence of measures on $E$ and the topology of uniform convergence on compact
sets on $F$. Boundedness of the coefficients plays a crucial role to prove that the probability
measures in the range of $\Phi$ are tight. Positivity of $\sigma$ ensures that the FBSDE value functions
in the range of $\Phi$ are Hölder continuous and thus live in a compact subset of $F$. A complete
proof is given in [3].
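
To make the Wasserstein metric used in Theorem 6.1 more tangible, here is a minimal sketch for two empirical measures on the real line with the same number of equally weighted atoms. It relies on the classical fact that, in dimension one and for the quadratic cost, the infimum over couplings is attained by pairing the sorted samples. The function name is ours, and the restriction to equal sample sizes is an assumption made for simplicity.

```python
import math


def wasserstein2_1d(xs, ys):
    """2-Wasserstein distance between two empirical measures on R with
    the same number of equally weighted atoms (monotone coupling)."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("both samples must be non-empty and of equal size")
    pairs = zip(sorted(xs), sorted(ys))   # optimal coupling pairs order statistics
    mean_sq = sum((x - y) ** 2 for x, y in pairs) / len(xs)
    return math.sqrt(mean_sq)
```

For instance, translating every atom of a sample by a constant $c$ gives a distance equal to $|c|$, in line with the definition recalled above.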

The application of Theorem 6.1 to the solvability of the adjoint equations in (19) and
(24) is not so straightforward. Indeed, the coefficients $B$ and $F$ therein derive from a convex
Hamiltonian structure, which means that they appear as derivatives with respect to the state
space parameter (and also possibly with respect to the measure parameter) of the underlying
Hamiltonian function. In practice, the Hamiltonian function is expected to grow at infinity
at a rate which could be quadratic with respect to the state space parameter and the control
variable, so that boundedness of the coefficients $B$ and $F$ is expected to fail. Similarly, the
terminal cost $g$ in (19) and (24) may have quadratic growth, so that the terminal condition in
the adjoint equations is not bounded in practice. As a consequence, refinements of Theorem
6.1 are necessary for our specific purposes.

Here is a first straightforward refined statement of this kind: when the coefficient $B$ grows
at most linearly in $(x,y)$ and $\mathcal{L}(X_t, Y_t)$, and $F$ and $G$ are bounded, existence of a solution still
holds true. When we say that $B$ is at most of linear growth with respect to $(x, y, \mathcal{L}(X_t, Y_t))$,
we mean that



Solvability follows from a compactness argument again. Approximating the drift $B$ by a
sequence of bounded drifts $(B^n)_{n \ge 1}$, we can prove that, for the corresponding solutions
$((X^n_t, Y^n_t = u^n(t, X^n_t))_{0 \le t \le T})_{n \ge 1}$, the distributions $(\mathcal{L}(X^n))_{n \ge 1}$ are tight and that the
functions $(u^n)_{n \ge 1}$ are equicontinuous on compact subsets of $[0,T] \times \mathbb{R}$. The proof follows from
two key observations: first, since $F$ and $G$ are bounded, the functions $(u^n)_{n \ge 1}$ are uniformly
bounded, so that the processes $(X^n)_{n \ge 1}$ are tight; second, by smoothing properties of heat
kernels, the functions $(u^n)_{n \ge 1}$ are locally uniformly continuous, see [9]. Extracting a
convergent subsequence, we can pass to the limit by using stability of forward-backward stochastic
differential equations.

Here is another straightforward refinement: when G is bounded, and F is bounded with 
respect to all the parameters but y, and has a linear growth with respect to y, the solvability 
still holds. Indeed, a standard maximum principle for BSDEs says that the process Y_ is 
bounded, the bound depending upon the bound of G and the growth of F only, so that F 
may be seen as a bounded driver. 



6.2 Solvability of the MFG Adjoint Equations 

In the MFG framework, the FBSDE system is given by (19). With the notation of (70), it reads

$$\begin{aligned}
B(t, X_t, Y_t, \mathcal{L}(X_t, Y_t)) &= b\bigl(t, X_t, \mathcal{L}(X_t), \hat\alpha^{\mathcal{L}(X_t)}(t, X_t, Y_t)\bigr),\\
F(t, X_t, Y_t, \mathcal{L}(X_t, Y_t)) &= \partial_x H^{\mathcal{L}(X_t)}\bigl(t, X_t, Y_t, \hat\alpha^{\mathcal{L}(X_t)}(t, X_t, Y_t)\bigr),\\
G(X_T, \mathcal{L}(X_T)) &= \partial_x g(X_T, \mathcal{L}(X_T)),
\end{aligned}$$

where $H$ and $\hat\alpha$ are given by (12) and (13). We emphasize that the dependence upon the
McKean-Vlasov interaction is limited to the law of $X_t$ only and that the coefficients do
not involve the law of $Y_t$. When $\partial_x g$ is bounded and $\partial_x H$ is bounded with respect to all
the parameters but $y$, and is at most of linear growth in $y$, Theorem 6.1 applies directly,
provided that the coefficients are Lipschitz continuous with respect to $x$, $y$ and $\mu$, the Lipschitz
continuity with respect to $\mu$ being with respect to the Wasserstein distance. (See [3, Section
3] for the analytical counterpart of this result.) In particular, the minimizer $\hat\alpha^{\mu}(t,x,y)$ must
be Lipschitz continuous as well. In the examples tackled below, the Hamiltonian is always
strictly convex in the variable $\alpha$ so that the regularity of the minimizer follows from the
implicit function theorem.
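
As a worked illustration of this last point, consider the following special case, which is an assumption made here for the sake of the example and not a hypothesis of the text: the drift is affine, $b(t,x,\mu,\alpha) = b_0(t,\mu) + b_1(t)x + b_2(t)\alpha$ with $b_2$ bounded, and the running cost is separable, $f(t,x,\mu,\alpha) = f_0(t,x,\mu) + \frac{1}{2}\alpha^2$. Then

$$H^{\mu}(t,x,y,\alpha) = \bigl(b_0(t,\mu) + b_1(t)x + b_2(t)\alpha\bigr)y + f_0(t,x,\mu) + \frac{1}{2}\alpha^2,$$

and the first-order condition $\partial_\alpha H^{\mu}(t,x,y,\alpha) = b_2(t)y + \alpha = 0$ gives

$$\hat\alpha^{\mu}(t,x,y) = -b_2(t)\,y, \qquad |\hat\alpha^{\mu}(t,x,y) - \hat\alpha^{\mu}(t,x,y')| \le \sup_{0 \le s \le T}|b_2(s)|\,|y - y'|.$$

In this case the minimizer is explicit and Lipschitz continuous in $y$, uniformly in the other variables.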

In practice, the cost functions $g$ and $H$ are expected to be of quadratic growth in $x$ so
that $\partial_x g$ and $\partial_x H$ cannot be bounded. The main point is thus to allow $F$ and $G$ to grow
at most linearly in the statement of Theorem 6.1. The strategy developed in [6] consists in
approximating the cost functions of the MFG control problem by a sequence of cost functions
with bounded derivatives in $x$ preserving the convexity of the Hamiltonian in $(x,\alpha)$. Here is
the solvability result obtained in [6]:

Theorem 6.2. Let us assume that

(i) The cost function $f$ is Lipschitz continuous in the variables $x$, $\mu$ and $\alpha$ on any balls
of center 0 and radius $R$, that is on sets where $x$ and $\alpha$ are bounded by $R$ and the
second-order moment of $\mu$ is bounded by $R^2$, the Lipschitz constant being at most of linear growth
in $R$. Moreover, $f$ is twice continuously differentiable with respect to $x$ and $\alpha$, with
uniformly bounded second-order derivatives (uniformly in $(t,x,\mu,\alpha)$). The partial derivative
$\partial_\alpha f(t,x,\mu,0)$ is uniformly bounded and the product $x\,\partial_x f(t,0,\delta_x,0)$ is always non-negative.
(Here, $\delta_x$ denotes the Dirac mass at point $x$.) Moreover, the function $f$ is convex in $(x,\alpha)$
for $(t,\mu)$ fixed in the sense that there exists a constant $\lambda > 0$ such that:

$$f(t,x',\mu,\alpha') - f(t,x,\mu,\alpha) - \bigl\langle (x' - x, \alpha' - \alpha), \nabla_{(x,\alpha)} f(t,x,\mu,\alpha) \bigr\rangle \ge \lambda |\alpha' - \alpha|^2;$$

(ii) The terminal cost $g$ is Lipschitz continuous in the variables $x$ and $\mu$ on any balls of
center 0 and radius $R$ (in the same sense as $f$) and the Lipschitz constant is at most of linear
growth in $R$. Moreover, $g$ is twice continuously differentiable with respect to $x$, with uniformly
bounded second-order derivatives (uniformly in $(x,\mu)$), $g$ is convex in $x$ when $\mu$ is fixed and
the product $x\,\partial_x g(0,\delta_x)$ is always non-negative;

(iii) $b$ is affine in $x$ and $\alpha$ in the sense that $b(t,x,\mu,\alpha) = b_0(t,\mu) + b_1(t)x + b_2(t)\alpha$ where
$b_0$, $b_1$ and $b_2$ are bounded and $b_0$ is Lipschitz-continuous with respect to $\mu$ for the Wasserstein
distance.

Then, the adjoint equation (19) has at least one solution.

There are three types of assumptions in the statement of Theorem 6.2. First, by linearity
of $b$ in $(x,\alpha)$ and by convexity of $f$ in $(x,\alpha)$, the Hamiltonian $H$ is convex in $(x,\alpha)$. It is
even strictly convex in $\alpha$. Following (19), this says that the stochastic Pontryagin principle
applies in the MFG procedure. Second, Lipschitz properties say that the coefficients are
regular with respect to the parameters $x$, $\alpha$ and $\mu$. Actually, by a careful inspection of (i) and
(ii), the reader might notice that nothing is said about the regularity of $\partial_x H$ and $\partial_x g$ in the
parameter $\mu$ whereas $F$ and $G$ are asked to be Lipschitz continuous in $\mu$ in the statement
of Theorem 6.1. Indeed, as proven in [6], only the regularity of the cost functions with
respect to the parameter $\mu$ really matters within the framework of control theory. Third, the
sign conditions on $x\,\partial_x f(t,0,\delta_x,0)$ and $x\,\partial_x g(0,\delta_x)$ read as mean-reverting assumptions
of a weak type: when approximating the coefficients by bounded ones, the sign conditions
guarantee that the expectations of the approximating solutions do not blow up.
In comparison with Section 3, we emphasize that the mean-field linear-quadratic models
investigated therein do not satisfy the boundedness condition required for $b_0$ in (iii). In
the current setting, the boundedness condition ensures that the fixed point measures of the
approximating MFG problems are tight. In the linear-quadratic framework, the tightness
condition can be expressed differently, as the problem may be reformulated directly, by
investigating the solvability of associated Riccati equations. Moreover, the current sign conditions
must be compared with the sign conditions in the statement of Theorem 3.1. In the framework
of Theorem 3.1, $x\,\partial_x g(0,\delta_x) = q\bar q x^2$ and $x\,\partial_x f(t,0,\delta_x,0) = m_t \bar m_t x^2$, so that the sign conditions
in Theorem 6.2 hold if $q\bar q \ge 0$ and $m_t \bar m_t \ge 0$ respectively. Keep in mind that we can always
assume $q \ge 0$ and $m_t \ge 0$. Clearly, this is stronger than the sign conditions $q(q + \bar q) \ge 0$ and
$\inf_{t \in [0,T]}[m_t(m_t + \bar m_t)] \ge 0$ required in Theorem 3.1.

Sketch of Proof (Theorem 6.2). As already announced, the proof consists in approximating
$f$ and $g$ by two sequences $(f^n)_{n \ge 1}$ and $(g^n)_{n \ge 1}$ satisfying the same assumptions, but with
bounded derivatives in $x$. The construction of the approximating sequence follows from
arguments of convex analysis, which may be found in [6]. We use the convexity of $f$ and $g$ to
express $f$ and $g$ as their own Legendre bi-conjugates and then truncate the dual variable in
the Legendre representation.
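
The truncation mechanism can be illustrated on a toy case. The sketch below takes the hypothetical scalar cost $x \mapsto x^2/2$, whose Legendre conjugate is $p \mapsto p^2/2$, and restricts the dual variable to $[-n, n]$. It is only meant to show the idea; the actual construction in [6] handles the full dependence of $f$ and $g$ on their arguments, and the function name is ours.

```python
def truncated_quadratic(x, n):
    """sup over |p| <= n of (p * x - p**2 / 2): the cost x**2 / 2 with its
    dual variable truncated to [-n, n]."""
    p = max(-n, min(n, x))   # maximizer of the concave map p -> p*x - p**2/2 on [-n, n]
    return p * x - 0.5 * p * p
```

The resulting function coincides with $x^2/2$ for $|x| \le n$, grows linearly beyond, is convex, has a derivative bounded by $n$, and is non-decreasing in $n$.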

By Theorem 6.1, we then expect to find, for each $n \ge 1$, a solution $(X^n, Y^n)$ to the
adjoint equations associated with the approximate Hamiltonian

$$H^{\mu,n}(t,x,y,\alpha) = b(t,x,\mu,\alpha)\,y + f^n(t,x,\mu,\alpha), \quad t \in [0,T],\ x, y, \alpha \in \mathbb{R},\ \mu \in \mathcal{P}_1(\mathbb{R}),$$

and therefore with the coefficients $B^n(t,x,y,\mu) = b(t,x,\mu,\hat\alpha^n(t,x,y,\mu))$, $F^n(t,x,y,\mu) =
\partial_x f^n(t,x,\mu,\hat\alpha^n(t,x,y,\mu)) + b_1(t)y$ and $G^n(x,\mu) = \partial_x g^n(x,\mu)$, where

$$\hat\alpha^n(t,x,y,\mu) = \mathrm{argmin}_{\alpha \in \mathbb{R}}\, H^{\mu,n}(t,x,y,\alpha).$$

(We say "expect" only since we said nothing about the smoothness of $\partial_x H$ and $\partial_x g$ with respect
to $\mu$. We refer to [6] for the complete argument.) We also expect $\hat\alpha^n(t,x,y,\mu)$ to converge towards
$\hat\alpha(t,x,y,\mu)$ as $n \to +\infty$.

For $n \ge 1$, we denote by $X^n$ the process controlled by $(\hat\alpha^n_t = \hat\alpha^n(t, X^n_t, Y^n_t, \mathcal{L}(X^n_t)))_{0 \le t \le T}$
and by $u^n$ the FBSDE value function such that $Y^n_t = u^n(t, X^n_t)$ for any $t \in [0,T]$. Our goal
is to establish tightness of the processes $(X^n)_{n \ge 1}$ and relative compactness of the functions
$(u^n)_{n \ge 1}$ for the topology of uniform convergence on compact subsets of $[0,T] \times \mathbb{R}$. The first
step is to prove that

(74)

and the second step is to prove that the growth of the functions $(u^n)_{n \ge 1}$ can be controlled,
uniformly in $n \ge 1$.

In order to prove (74), we take advantage of the Hamiltonian structure of the coefficients
by comparing the behavior of $X^n$ to the behavior of a reference controlled process, driven by
specific values of the control. Below, the generic notations for the reference controlled process
and for the reference control are $U^n$ and $\beta^n$ respectively. Two different versions are considered
for $U^n$: they are driven by the controls $\beta^n = 0$ and $(\beta^n_s = \mathbb{E}(\hat\alpha^n_s))_{0 \le s \le T}$, respectively. For
each of these controls, we compare the corresponding cost to the optimal one by using the
stochastic Pontryagin principle, in order to derive some information on the optimal
control $(\hat\alpha^n_s)_{0 \le s \le T}$. The starting point is to compare the behavior of $X^n$ to the behavior of
the process $U^n$ controlled by the deterministic control $(\beta^n_s = \mathbb{E}(\hat\alpha^n_s))_{0 \le s \le T}$. As explained
right below, the goal is to obtain a uniform bound for the $L^2$-norm in time of the variance of
$(\hat\alpha^n_s)_{0 \le s \le T}$. Indeed, because of the specific affine structure of $b$ in (iii), it can be checked that
$\mathbb{E}(U^n_s) = \mathbb{E}(X^n_s)$, for $0 \le s \le T$, and that $\sup_{n \ge 1} \sup_{0 \le s \le T} \mathrm{Var}(U^n_s) < +\infty$. By the stochastic
Pontryagin principle and by the growth and convexity properties in (i)-(iii), we then claim



< E 



g^U¥,C{X^))+ / r{s,U^,C{X^),0)ds 



from which we can derive (see [6]) 



sup 

n>l 



Var(d;")ds+ sup Var(X") 
o<s<r 



< c 1 + E 



1/2 



(75) 



Bound (75) says that (74) holds if the expectations of the variables $(X^n_t)_{0 \le t \le T}$ are bounded.
In other words, we only need to focus on the expectations of the variables $(X^n_t)_{0 \le t \le T}$.

The second step in the proof of (74) consists in comparing the behavior of $X^n$ to the
behavior of the process controlled by the null control. Since no confusion is possible, we still
denote by $U^n$ the process controlled by the null control $\beta^n = 0$. By boundedness of $b_0$ in (iii),
it is easily checked that $\sup_{n \ge 1} \mathbb{E}[\sup_{0 \le s \le T} |U^n_s|^2] < +\infty$. Applying the stochastic Pontryagin
principle again together with Jensen's inequality, we obtain



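$$g^n\bigl(\mathbb{E}(X^n_T), \mathcal{L}(X^n_T)\bigr) + \int_0^T \Bigl[\lambda\, \mathbb{E}(|\hat\alpha^n_s|^2) + f^n\bigl(s, \mathbb{E}(X^n_s), \mathcal{L}(X^n_s), \mathbb{E}(\hat\alpha^n_s)\bigr)\Bigr]\,ds \le \mathbb{E}\Bigl[g^n\bigl(U^n_T, \mathcal{L}(X^n_T)\bigr) + \int_0^T f^n\bigl(s, U^n_s, \mathcal{L}(X^n_s), 0\bigr)\,ds\Bigr].$$

The main point in [6] is to deduce that

$$\mathbb{E}(X^n_T)\,\partial_x g^n\bigl(0, \delta_{\mathbb{E}(X^n_T)}\bigr) + \int_0^T \Bigl[\frac{\lambda}{2}\mathbb{E}(|\hat\alpha^n_s|^2) + \mathbb{E}(X^n_s)\,\partial_x f^n\bigl(s, 0, \delta_{\mathbb{E}(X^n_s)}, 0\bigr)\Bigr]\,ds \le c. \tag{76}$$

By the sign property in (i) and (ii), (74) follows. We deduce that
$\sup_{n \ge 1} \mathbb{E}[\sup_{0 \le s \le T} |X^n_s|^2] < +\infty$. We also deduce that the processes $(X^n)_{n \ge 1}$ are tight.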

Once we have a uniform estimate for the moments of the $(X^n)_{n \ge 1}$, we are led back to
FBSDEs of the standard Markovian type, each of them appearing as the adjoint system of a
standard optimal stochastic control problem. Standard FBSDE arguments then imply that
$|u^n(t,x)| \le c(1 + |x|)$. Uniform continuity of the family $(u^n)_{n \ge 1}$ on compact subsets follows
from the smoothing effect of heat kernels, see [9].

Let $(u, \mu)$ be the limit of a convergent subsequence which we still denote $(u^n, \mathcal{L}(X^n))_{n \ge 1}$.
The smoothing effect of heat kernels can also be used to prove that the functions $(u^n)_{n \ge 1}$ are
uniformly locally Lipschitz-continuous in space, so that $u$ itself is locally Lipschitz-continuous
in space. Since $u$ grows at most linearly, the SDE

$$dX_t = B(t, X_t, u(t, X_t), \mu_t)\,dt + \sigma\,dW_t,$$

is strongly solvable. We can also solve the backward equation

$$dY_t = -F(t, X_t, Y_t, \mu_t)\,dt + Z_t\,dW_t, \quad Y_T = G(X_T, \mu_T).$$

By stability of forward and backward equations separately, it is plain to see that $X$ is the
limit of the $(X^n)_{n \ge 1}$ and that $Y$ is the limit of the $((u^n(t, X^n_t))_{0 \le t \le T})_{n \ge 1}$. Passing to the limit
in the approximating forward-backward systems, we conclude that $(X, Y)$ solves the adjoint
equation in (19). □



6.3 Solvability of the Adjoint Equations of the MKV Control Problem

In the framework of the optimal control of MKV processes, the forward-backward system
is given by (23). As for the MFG approach, it can be recast as a system of the type (70),
the coefficients of the backward component depending on the law of $Y$ through the term
$\mathbb{E}\{\partial_{x'} b(t, X_t, \mathbb{E}[\psi(X_t)], \alpha_t)\,Y_t\}\,\partial_x \psi(X_t)$. We claim:

Theorem 6.3. In the framework of Theorem 2.1, assume that $\gamma$ and $\zeta$ are twice-differentiable
convex functions, with bounded second-order derivatives and that $b$, $f$ and $g$ are
twice-differentiable functions with bounded second-order derivatives that satisfy (A1)-(A3). Assume in
addition that $b$ is linear with respect to $x$, $x'$ and $\alpha$, namely $b(t,x,x',\alpha) = b_0(t) + b_1(t)x +
b_2(t)x' + b_3(t)\alpha$, where $b_0$, $b_1$, $b_2$ and $b_3$ are bounded functions. Assume furthermore that
$\psi(\xi) = \xi$, so that $\langle \psi, \mu \rangle$ stands for the expectation of $\mu$. Assume finally that $f$ is $\lambda$-convex
with respect to $\alpha$, for some $\lambda > 0$, so that $H$ in (27) is also $\lambda$-convex with respect to $\alpha$.

Then, the forward-backward system (24), with $\alpha_t = \hat\alpha(t, X_t, Y_t, \mathcal{L}(X_t))$, is uniquely
solvable, where $\hat\alpha(t,x,y,\mu) = \mathrm{argmin}_{\alpha} H(t, x, \langle \psi, \mu \rangle, \langle \gamma, \mu \rangle, y, \alpha)$.

Sketch of Proof. We first notice that the function $\hat\alpha$ is well-defined and Lipschitz-continuous
w.r.t. $x$, $y$ and $\mu$. As in the proof of Theorem 6.2, we consider sequences of convex functions
$(f^n)_{n \ge 1}$, $(g^n)_{n \ge 1}$, $(\gamma^n)_{n \ge 1}$ and $(\zeta^n)_{n \ge 1}$, with bounded first and second-order derivatives, that
converge towards $f$, $g$, $\gamma$ and $\zeta$ respectively. As explained in [5], we can choose $(f^n)_{n \ge 1}$,
$(g^n)_{n \ge 1}$, $(\gamma^n)_{n \ge 1}$ and $(\zeta^n)_{n \ge 1}$ to be non-decreasing sequences. For each $n \ge 1$, we define the
Hamiltonian $H^n$ by replacing $f$ by $f^n$ in (27), and we consider the forward-backward system
obtained by replacing $(f, g, \gamma, \zeta)$ by $(f^n, g^n, \gamma^n, \zeta^n)$ and $\alpha_t$ by $\alpha^n_t = \hat\alpha^n(t, X_t, Y_t, \mathcal{L}(X_t))$, the
argument of the minimum of $H^n(t, x, \langle \psi, \mu \rangle, \langle \gamma^n, \mu \rangle, y, \alpha)$: it is the forward-backward system
associated with the Hamiltonian $H^n$. We denote by $(X^n, Y^n)$ its solution.

By the stochastic Pontryagin principle, the process $(\alpha^n_t)_{0 \le t \le T}$ minimizes the cost function
(21), with $f$, $g$, $\gamma$ and $\zeta$ therein replaced by $f^n$, $g^n$, $\gamma^n$ and $\zeta^n$. In particular, $J^n(\alpha^n) \le
J^n(\alpha^{n+p})$, for $p \ge 0$. Actually, by $\lambda$-convexity of $f^n$ with respect to $\alpha$, we can even write, as
a variant of the stochastic Pontryagin principle,

$$J^n(\alpha^n) + \lambda\, \mathbb{E} \int_0^T |\alpha^n_s - \alpha^{n+p}_s|^2\,ds \le J^n(\alpha^{n+p}).$$

Since the sequences $(f^n)_{n \ge 1}$ and $(g^n)_{n \ge 1}$ are non-decreasing, we deduce that, for any
admissible control $\beta$, $J^n(\beta) \le J^{n+p}(\beta)$. Therefore,

$$J^n(\alpha^n) + \lambda\, \mathbb{E} \int_0^T |\alpha^n_s - \alpha^{n+p}_s|^2\,ds \le J^{n+p}(\alpha^{n+p}).$$

In particular, the sequence $(J^n(\alpha^n))_{n \ge 1}$ is non-decreasing. Thus, it has a limit since it is
bounded by $\sup_{n \ge 1}(J^n(0)) < \infty$. We deduce that $(\alpha^n)_{n \ge 1}$ is a Cauchy sequence. By stability
of standard McKean-Vlasov SDEs, we deduce that

$$\lim_{n \to +\infty} \sup_{p \ge 1} \mathbb{E}\bigl[\sup_{0 \le s \le T} |X^{n+p}_s - X^n_s|^2\bigr] = 0.$$

This proves that the processes $(X^n)_{n \ge 1}$ converge for the norm $\mathbb{E}[\sup_{0 \le s \le T} |\cdot_s|^2]^{1/2}$ towards a
continuous adapted process $X$. By standard results of stability for BSDEs, we then deduce
that the same holds for the processes $(Y^n)_{n \ge 1}$, that is

$$\lim_{n \to +\infty} \sup_{p \ge 1} \mathbb{E}\bigl[\sup_{0 \le s \le T} |Y^{n+p}_s - Y^n_s|^2\bigr] = 0.$$

One then deduces the existence of the limit $Y$, and the fact that $(X, Y)$ satisfies the
forward-backward system (24). Uniqueness easily follows from the stochastic Pontryagin principle. □
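
For completeness, here is how the Cauchy property of the controls can be read off the last displayed inequality of the proof; this short computation is our own spelling-out of the step and uses only the monotonicity and boundedness of $(J^n(\alpha^n))_{n \ge 1}$ stated above. Writing $\ell = \lim_{n \to +\infty} J^n(\alpha^n)$, we get, for all $n, p \ge 1$,

$$\lambda\, \mathbb{E} \int_0^T |\alpha^n_s - \alpha^{n+p}_s|^2\,ds \le J^{n+p}(\alpha^{n+p}) - J^n(\alpha^n) \le \ell - J^n(\alpha^n),$$

and the right-hand side tends to 0 as $n \to +\infty$, uniformly in $p$. Since $\lambda > 0$, the sequence $(\alpha^n)_{n \ge 1}$ is Cauchy in $L^2([0,T] \times \Omega)$.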

Remark 7. Uniqueness in Theorem 6.3 says that, for any $t \in [0,T]$, $Y_t$ is a function of
$X_t$ and of the law of $X_t$, as in (72). This also follows from a standard change of filtration,
as usually done in the theory of BSDEs. In particular, the optimal control $\hat\alpha$ is a feedback
control, as expected.